RECOGNITION AND NUMBER OF INCORRECT
ALTERNATIVES PRESENTED
DURING LEARNING¹


BENTON J. UNDERWOOD,² MILES PATTERSON,³ AND JOEL S. FREUND
Northwestern University


A theory of recognition memory has been 
advanced which assumes that in the usual 
or typical recognition task with verbal 
units, performance is largely determined by 
a frequency differential (Underwood & 
Freund, 1970). In a typical experiment, a 
subject is given a series of units for a single 
study trial following which testing is ac- 
complished by mixing these old units with a 
series of new units and the subject is asked 
to identify the old. The theory assumes that 
old units have a situational frequency of 1, 
the new units a situational frequency of 0. 
The degree to which a subject can discrimi- 
nate this frequency difference will deter- 
mine his performance. The implication of 
the theory is, perhaps, most clearly seen in 
the multiple-choice type of recognition test. 


1 This work was supported by Contract N00014- 
67-A-0356-0010, Project NR 154-057, between 
Northwestern University and the Office of Naval 
Research. Reproduction in whole or in part is per- 
mitted for any purpose of the United States Gov- 
ernment. 

2 Requests for reprints should be sent to Benton
J. Underwood, Department of Psychology, North- 
western University, Evanston, Illinois 60201. 

3 Now at the University of Missouri-St. Louis.
If each old item is paired with a new item,
the subject may make a direct frequency 
comparison of the two. Of course, there may 
be two or more new items accompanying 
each old item. With three or four such items 
present, the test structure for each old word 
is much the same as for multiple-choice 
tests used so frequently to assess academic 
performance. Indeed, the present experi- 
ments came about as a consequence of view- 
ing multiple-choice tests from the perspec- 
tive of frequency theory, albeit the final 
procedure adopted was not intended to sim- 
ulate these tests. 

The learning task consisted of 50 unre- 
lated words. The test for recognition mem- 
ory consisted of 50 sets of five words each 
with each set containing one correct word 
and four incorrect words. The experimental 
variable was the number of these incorrect 
words which had been present at the time
the subject was attempting to learn the correct
word. Let the five words in a given test set 
be identified as A, B, C, D, and E, with A 
the correct word and the others incorrect. 
The variable, then, was the number of the 
four incorrect responses which were also 
present during the learning stage. In Condi-
tion 0, none of the incorrect alternatives
was presented during learning, hence, A was 
presented alone. In Condition 1, one of the 
alternatives was presented during learning, 
and so on. For Condition 4, it can be seen, 
all incorrect alternatives were presented 
during learning. All five words in a set al- 
ways occurred during the test phase. 

The expectations from frequency theory 
may now be considered. Suppose that dur-
ing learning, the subject has two incorrect
responses (B and C) presented along with 
the correct word (A). These are presented 
successively as follows: B, A, C, A. As each 
word appears, the subject pronounces it and 
the appearance of A the second time (it is 
also underlined at its second presentation) 
provides the information that A is correct. 
At the time of the test, the subject is shown 
A, B, C, D, and E and is asked to select the 
correct word. For this particular case, three
different frequency levels exist at the time 
of the test. The correct word, A, has had 
two frequency inputs or units, B and C 
have each had one, and D and E are new, 
hence have zero situational frequencies. If 
the subject can discriminate between fre- 
quency values of 2 and 1, he will be correct 
by choosing the unit with the higher fre- 
quency. If he cannot discriminate between 2 
and 1, but still uses the frequency attribute 
to determine his selection, the error is more 
likely to result from a choice of B or C, 
rather than D or E, since the former two 
each have one unit of situational frequency. 
To say this in more general terms: If an 
error is made, it is most likely to be made 
by choosing a word which was present dur-
ing the learning phase in spite of the fact
that the subject was given indirect informa- 
tion that it was not correct. 

Consider next the case where A is pre-
sented alone during the learning phase. The
subject sees A, A, and, therefore, the fre- 
quency is 2. At the time of the test, A will 
occur with four additional words all of 
which have a frequency of 0. If recognition 
is based on a frequency differential, the dis- 
crimination will be 2 versus 0. The predic- 
tion must follow then that recognition per- 
formance will be better when no incorrect 
alternatives are presented during learning
than when one or more incorrect alterna-
tives are presented.

As examined thus far, the theory leads to
two rather direct predictions, namely, Con-
dition 0 will result in performance that is
superior to the performance of the other
four conditions, and that for these latter
four conditions, when an error is made it is
far more likely to result from the choice of
an alternative presented during learning
than from the choice of an alternative not
presented. The final prediction to be consid-
ered deals with possible differences in per-
formance among the four conditions in
which from one to four incorrect alterna-
tives were presented during learning. The
reasoning is somewhat indirect and requires
three steps.

As the first step, it is necessary to con-
sider the possible outcome of an experiment
in which no incorrect alternatives are pre-
sented during learning (as in Condition 0)
but in which testing takes place by having
one, two, three, or four alternatives. With
one new alternative, the subject is faced
with a choice between an old item (pre-
sented twice) and one new item. From
Hintzman's (1969) work it is known that a
subject will choose the old item approxi-
mately 90% of the time when asked to
choose the word with the highest frequency.
This means that for a small proportion of
the pairs, the subjects cannot discriminate
between the old and new on the basis of fre-
quency. When an error is made it must
mean that the apparent frequency of the
item presented twice is far less than two
(perhaps zero), that the apparent frequency
of the new item is above zero, or some com-
bination of these two events. That a new
word could have an apparent frequency
greater than zero could result from a num-
ber of factors which need not be of concern
here. The critical point is that the apparent
frequency of an item presented twice will
occasionally be less than the apparent fre-
quency of a word not presented (a new
word). Now, if the apparent frequency of a
single new word will sometimes be greater
than that of a word presented twice, what
will happen as two, three, or four new words 
are used? It seems proper to assume that
the addition of each new alternative will
increase the probability that the apparent
frequency of one of the new words will be 
greater than the apparent frequency of the 
old word (presented twice). Thus, although 
Hintzman (1969) found that the subject 
chose the most frequent word 90% of the
time when a single new alternative was pre- 
sented on the test, this value should be ap- 
preciably lower in the present Condition 0 
since four new alternatives are used on the 
test. 
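
The direction of this expectation can be illustrated with a small
calculation. The sketch below assumes, purely for illustration, that
each new alternative is compared with the old word independently and
that the old word wins each comparison with the same probability. The
theory as stated does not commit to independence, and the function
name p_choose_old is hypothetical.

```python
def p_choose_old(p_single: float, n_new: int) -> float:
    """Probability that the old word beats all n_new new alternatives.

    Illustrative assumption (not from the source): each comparison is
    independent and is won by the old word with probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_new < 0:
        raise ValueError("n_new must be non-negative")
    return p_single ** n_new


# 0.90 is the approximate single-alternative value attributed in the
# text to Hintzman (1969); the multi-alternative values are hypothetical.
for n_new in range(1, 5):
    print(n_new, round(p_choose_old(0.90, n_new), 3))
```

Under this assumption the probability of a correct choice falls with
each added new alternative, which is the qualitative point made above;
the specific values carry no empirical weight.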

As the second step, the effect of adding
from one to four incorrect alternatives dur- 
ing the learning must be evaluated. Suppose 
that a subject is presented one incorrect al- 
ternative during learning and is tested by 
presenting only this incorrect alternative 
and the correct word. Hintzman’s (1969) 
data show that a subject will be correct 
67% of the time when he is asked to choose 
the word with the highest frequency. This is 
far less than when the incorrect alternative 
is new (as described above). If more than 
one incorrect alternative is presented during 
learning (and also used on the test), the 
number of errors should increase directly as 
the number of incorrect alternatives in-
creases. This is to say that as the number of
incorrect alternatives increases, there is an 
increase in the likelihood that the apparent 
frequency of one of these will be greater 
than the apparent frequency of the correct 
word. Thus, an expectation of change in 
performance as alternatives are added is 
the same whether new items (not previously 
presented) are added, or whether items pre- 
sented once are added. The difference lies 
only in the base error rate produced by a 
single wrong alternative that had or had 
not been presented during the learning 
phase. Given this base difference, an in-
crease in errors produced by having two,
three, or four incorrect alternatives should 
occur whether the additions consist of new 
items or whether they consist of items pre- 
sented once. 

As the final step, the conditions of the 
present experiment may be examined in the 
light of the above assumptions. The five 
conditions as they are presumed to differ at 
the time of the test are as follows: 


                Frequency of Five
                  Alternatives
Condition 0     2    0    0    0    0
Condition 1     2    1    0    0    0
Condition 2     2    1    1    0    0
Condition 3     2    1    1    1    0
Condition 4     2    1    1    1    1


The value of 2 represents the frequency of 
the correct item, 1 the frequency of incor- 
rect items presented during learning, and 0 
the frequency of incorrect items not pre- 
sented during learning. As described earlier, 
it is expected that Condition 0 will be supe- 
rior to Conditions 1-4. On the surface it 
may appear that performance would deter-
iorate as the number of ones increases. How-
ever, with each increase in the number of
ones, there is a reciprocal decrease in num-
ber of zeros. Therefore, in proceeding from
Condition 1 to Condition 4 there is an in- 
crease in one source of error but a corre- 
sponding decrease in errors from another 
source. If the rates of increase and decrease 
are equal, total errors should not differ 
among the four conditions, although the 
theory doesn’t specify such equality. 
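
The frequency structure of the five conditions can be explored with a
small simulation. The sketch below is not the authors' model: it
assumes, for illustration only, that the apparent frequency of each
alternative equals its input frequency (2, 1, or 0) plus independent
Gaussian noise, and that the subject chooses the alternative with the
highest apparent frequency. The function name, the noise level sigma,
and the number of trials are all hypothetical.

```python
import random


def simulate(n_old_wrong, trials=20000, sigma=1.0, seed=0):
    """Simulate one condition under an assumed Gaussian-noise model.

    n_old_wrong is the number of incorrect alternatives presented
    during learning (0 to 4). Returns a tuple of
    (error rate, share of errors falling on presented alternatives).
    """
    if not 0 <= n_old_wrong <= 4:
        raise ValueError("n_old_wrong must be between 0 and 4")
    rng = random.Random(seed)
    # Index 0 is the correct word (frequency 2); the rest are incorrect.
    true_freqs = [2] + [1] * n_old_wrong + [0] * (4 - n_old_wrong)
    errors = 0
    old_errors = 0
    for _ in range(trials):
        apparent = [f + rng.gauss(0.0, sigma) for f in true_freqs]
        choice = max(range(5), key=apparent.__getitem__)
        if choice != 0:
            errors += 1
            if true_freqs[choice] == 1:
                old_errors += 1
    share = old_errors / errors if errors else float("nan")
    return errors / trials, share


for condition in range(5):
    rate, share = simulate(condition)
    print(f"Condition {condition}: error rate {rate:.3f}, "
          f"share of errors on presented alternatives {share:.3f}")
```

Under this particular noise assumption, Condition 0 yields the fewest
errors and errors fall disproportionately on presented alternatives,
in line with the first two expectations. The simulated error rate also
tends to rise from Condition 1 to Condition 4, so this simple model
does not reproduce the equal-slope possibility considered above.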

It should be noted that the theoretical 
expectations were reached by assuming that 
recognition is mediated primarily by a fre- 
quency discrimination. Evidence from fre- 
quency discrimination studies was used in 
arriving at the predictions. These predic- 
tions appear to have been confirmed in an 
experiment done by Kaess and Zeaman 
(1960). The subjects in the Kaess-Zeaman 
experiment were given a 30-item multiple- 
choice test dealing with definitions of psy- 
chological terms. On the first trial there 
were zero to four incorrect alternatives 
present. The subject discovered the correct 
response by inserting a punch into the an- 
swer sheet. On the second trial, all questions 
had five alternatives. The results show that 
performance was best if no incorrect alter- 
natives were present on the first trial, but 
there was little difference for the conditions 
having one to four incorrect alternatives. 
Furthermore, the investigators showed that 
the errors made on the second trial were 
most likely to be made by choosing an al- 
ternative that had also been chosen on the 
first trial. Thus, these results seem quite in 
line with the theoretical expectations from 
frequency theory. However, precise knowl-
edge of the input frequency of the correct
and incorrect responses on the first trial is 
lacking. Also, since a given subject had only 
one condition for all 30 items, the total 
number of different terms to which subjects 
were exposed differed across conditions. 
These two points are not criticisms of the 
Kaess-Zeaman study, per se, but they are 
points which cause doubts as to whether the 
data can be taken as adequate tests of the 
frequency theory as applied to multiple- 
choice learning. 

The experiments to be reported examined 
one further variable. Disregarding fre- 
quency theory for the moment, it may be 
asked how much the information about 
right and wrong items is dependent upon 
the particular items constituting a set, that 
is, the items forming each multiple-choice 
question. Does knowledge of a correct item 
depend upon a contrast of “wrongness” for 
the other items in the set? Or to ask the 
question in more general terms: Is knowl- 
edge of right and wrong contingent upon the 
context of each set? Frequency theory 
makes no assumption about this issue. The 
theory asserts that frequency differentials 
are dominant in mediating recognition per- 
formance, and this would be true regardless 
of a change in the context as defined by a 
set of words. In Experiment I, the context 
of a set remained constant between learning 
and testing. In Experiment II, the context 
changed in that incorrect alternatives oc- 
curring with a given correct word during 
learning appeared with a different correct 
word during testing. 


METHOD
Experiment I 
Materials 


A total of 250 two-syllable words with fre- 
quencies between 1 and 10 were chosen from 
Thorndike and Lorge (1944). This pool was divided 
randomly into 50 sets of five words each. As a 
further step, one of the five words was randomly 
chosen to be the correct word and it remained cor- 
rect for all conditions. The 50 items for a given 
subject consisted of 10 fitting each of the five con- 
ditions, namely, zero, one, two, three, or four 
incorrect alternatives presented during learning. 
However, five forms were used, so that across all 
five forms each of the 50 correct items occurred 
once under each of the five conditions. The posi-
tion of a given item in the learning series was the
same across all forms; it differed only in terms of
the number of incorrect alternatives presented.
The order of the 50 items was random subject to 
the restriction that an item in each condition oc- 
cur in each five-item block. A different random 
order was used on the test form, this test form 
being exactly the same for all subjects. A test item 
always consisted of the correct word and four in- 
correct words randomly ordered. 
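
One way to satisfy these counterbalancing constraints is sketched
below. The report does not state how the five forms were actually
constructed; the cyclic rotation of conditions across forms, the
function name build_forms, and the seed are assumptions made for
illustration.

```python
import random

N_ITEMS = 50
N_CONDITIONS = 5  # zero to four incorrect alternatives during learning


def build_forms(seed=0):
    """Return five forms; forms[f][i] is the condition of item i on form f.

    Assumed scheme: a random permutation of the five conditions is drawn
    for each five-item block, and each later form shifts every item's
    condition by one (modulo 5).
    """
    rng = random.Random(seed)
    base = []
    for _ in range(N_ITEMS // N_CONDITIONS):
        block = list(range(N_CONDITIONS))
        rng.shuffle(block)
        base.extend(block)
    return [[(c + f) % N_CONDITIONS for c in base]
            for f in range(N_CONDITIONS)]


forms = build_forms()
for number, form in enumerate(forms, start=1):
    print(f"Form {number}: first block conditions {form[:5]}")
```

With this scheme the position of an item is fixed across forms, each
form has 10 items per condition, each five-item block contains one
item from every condition, and each item occurs once under every
condition across the five forms.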


Procedure and Subjects 


The subjects were fully informed concerning the 
nature of the learning phase and how they were 
to be tested. The instructions included the use of 
a sample card to illustrate how the items would 
be presented and how the number of items in the 
sets would differ during the learning phase. Presen-
tation was by a memory drum at a 2-second rate.
Each word was presented individually at this rate 
and the subject was required to pronounce each 
word aloud as it appeared. After the last alterna- 
tive appeared, the correct alternative was shown 
again, underlined, and the subject pronounced it 
for the second time. At this point it was assumed 
that all incorrect alternatives had frequencies of 1, 
and the correct alternative a frequency of 2. If no 
incorrect alternative was presented (Condition 0), 
the correct word followed itself, and the subject 
pronounced it for the second time. Following the 
second appearance of the correct item for a given 
set, an asterisk was shown for 2 seconds signifying 
that the words from a new set would appear next. 

On the unpaced recognition test, a subject went 
through the 50 sets of five words each, circling the 
correct word in each set. No omissions were al- 
lowed. There were 25 sets on each of two pages 
and the subject was required to complete the first 
page before going to the second. 

A total of 100 college students was used as 
subjects, 20 being assigned to each form by a
block-randomized schedule. 


Experiment II 


One change was made in Experiment II. The 
test form was modified so that incorrect alterna-
tives appearing with a given correct word during
learning never occurred with that correct word on 
the recognition test. Rather, they appeared with a 
different correct word. Five different test forms
were used although the list presented for learning 
was the same as for Experiment I. The interchange 
between items was always made for items having 
the same number of incorrect alternatives during
learning. Condition 0, in which no incorrect alter- 
natives were used during learning, was exactly the 
same in both experiments. Again, 100 subjects were 
used, 20 for each of the five forms. 


RESULTS
Number of Errors 


The mean numbers of errors (out of 10 
possible) as a function of number of incor-
rect alternatives presented during learning
are shown in Figure 1. Both experiments 
show a sharp increase in error frequency 
between 0 and 1 incorrect alternatives, with 
little change thereafter. For Experiment I, 
F = 17.39, df = 4/380, p < .01; for Experi-
ment II, F = 15.69, p < .01. Between one
and four incorrect alternatives for Experi- 
ment I, there is a slight upward slope to the 
error curve. However, a test of these four
points does not allow rejection of the hy- 
pothesis that the slope is zero (F < 1). The 
first theoretical expectation, therefore, ap- 
pears confirmed; best performance was ob- 
served for an item when no incorrect alter- 
natives were present during learning. The 
data also indicate that given at least one 
incorrect alternative during learning, per- 
formance does not change as the number is 
increased. 

There is a consistently greater number of
errors for Experiment II than for Experi-
ment I. However, statistically the overall
difference does not reach significance (F =
2.68, df = 1/198, p > .05). It seems appro- 
priate to conclude that whether an incorrect 
item had or had not appeared with a given 
correct item during learning does not influ- 
ence performance appreciably. 


Source of Errors 


The percentages of errors from each of 
two sources are plotted in Figure 2 for the 
two experiments combined. The two sources 
are: (a) those alternatives presented during 
learning, and (b) those not presented. Of 
course, with zero incorrect alternatives pre- 
sented during learning, all must arise from 
alternatives not presented. And, when four 
incorrect alternatives were presented during 
learning, all must arise from among those 
four. Therefore, to evaluate the expectation 
from the theory (that errors will arise 
largely from items presented during learn- 
ing) attention must be directed to the three 
conditions in which one, two, or three incor- 
rect alternatives were presented. When one 
incorrect alternative was presented during 
learning, 70% of the errors were the result 
of subjects choosing that alternative and 
30% were the result of choosing one of the
three alternatives not presented during 
learning. As may further be seen, when two

[Fig. 1. Mean errors in recognition as a function of the number of
incorrect alternatives presented during learning.]

incorrect alternatives were presented during 
learning, 85% of the errors resulted from a 
choice of one of these two alternatives. The 
chance likelihood of choosing an incorrect 
alternative presented during learning given 
an error is 25%, 50%, and 75% for one, two, 
and three alternatives, respectively. As can 
be seen, the empirical percentages are far 
above the chance percentages. 
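
The chance percentages follow from simple arithmetic: if a subject who
errs picked at random among the four incorrect alternatives, the share
of errors falling on presented alternatives would equal the number
presented divided by four. The sketch below restates this; the
function name is hypothetical, and only the observed values stated in
the text are included.

```python
def chance_share_old(n_presented: int, n_incorrect: int = 4) -> float:
    """Share of errors expected on presented alternatives if an erring
    subject chose at random among the incorrect alternatives."""
    if not 0 <= n_presented <= n_incorrect:
        raise ValueError("n_presented must be between 0 and n_incorrect")
    return n_presented / n_incorrect


# Observed shares reported in the text; the value for three presented
# alternatives is not stated in the text, so it is left out here.
observed = {1: 0.70, 2: 0.85}

for k in (1, 2, 3):
    chance = chance_share_old(k)
    obs = observed.get(k)
    shown = "not given in text" if obs is None else f"{obs:.0%}"
    print(f"{k} presented: chance {chance:.0%}, observed {shown}")
```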

The above data give strong support to the 
theoretical expectation that an error is 
more likely to be made by choosing an item 
presented during learning than by choosing 
one not presented. Yet, it is apparent that 
an item not presented during learning has 
some small probability of being chosen 
when an error is committed. As explained in

[Fig. 2. Percentage of error types for old (presented) and new (not
presented) alternatives.]

the introduction, this is believed to be due
to variations in apparent frequency which 
occur with constant input frequency. The
sources of these variations are not under-
stood at this time.


DISCUSSION


The present results are in substantial ac-
cord with the theory that word recognition
memory is dominated by frequency infor-
mation. To say that recognition memory is
dominated by a frequency attribute does 
not deny that other attributes may enter 
into certain of the decisions subjects must 
make on a recognition test. Indeed, it is 
possible that the subject sometimes chooses 
a new word (not presented during learning) 
because of an overlap of other attributes 
between this word and some correct word in 
the list. The data show that errors do not 
increase as the number of incorrect alterna- 
tives presented during learning increases 
(Figure 1). This could be accounted for if it 
is assumed that there are only two sources 
of error, namely, from new items and from 
items presented during learning. In the 
present experiment the numbers of items of 
each type were reciprocal. Therefore, it ap- 
pears that the increase in errors produced 
by adding items that had been presented 
during learning was counteracted by a de- 
crease in errors resulting from the dropping 
out of items that had not occurred on the 
learning trial. If these two curves had the 
same slope (one positive and one negative) 
it would result in the performance showing 
no increase in error frequency as number of
alternatives presented during learning in- 
creased beyond one. That the slopes of 
these two curves could be the same may 
seem intuitively unreasonable since neither 
an additive nor a strict probabilistic model 
would lead to this outcome. So, in detail the 
assumption may be wrong, although it 
seems to handle the present data. Ob- 
viously, to clarify this matter, data are 
needed on error increases as number of new 
alternatives increases and as number of al-
ternatives presented during learning in-
creases, but without mixing the two types
at the time of the test. 

The findings are quite consistent with 
those reported by Kaess and Zeaman 
(1960) as described in the introduction. On
the other hand, the results of a recent study 
by Sturges (1969) would seem to be at odds 
with frequency theory. In the Sturges 
study, subjects were given a multiple-choice 
test covering certain facts from the social 
sciences. There were four alternatives. Fol- 
lowing the initial testing, one group was 
shown the stem and only the correct alter- 
native, while another group was shown the 
stem, the correct alternative and the three 
incorrect alternatives. A further testing 
showed that the two groups did not differ. 
According to frequency theory, the group 
shown only the correct alternative should 
have been superior. However, if the subjects 
in the group shown the correct and incorrect 
alternatives ignored the latter and studied 
only the correct alternative, the conditions 
for the two groups were effectively not dif- 
ferent. If a subject had been forced to read 
the incorrect alternatives it seems highly 
probable that the findings would have been 
different. 

The above discussion implies that fre- 
quency theory may have some applicability 
to multiple-choice testing as commonly car- 
ried out in schools. Murdock (1963) ana- 
lyzed recognition performance on a multi- 
ple-choice test of general information. He 
reached the conclusion that a subject ap- 
pears to eliminate incorrect alternatives 
and then (if more than one alternative re- 
mains) chooses randomly from among the 
remaining. This conclusion is not necessar- 
ily at odds with frequency theory. Fre- 
quency theory specifies the attribute of 
memory which allows a subject to distin- 
guish between correct and incorrect alterna- 
tives. Perfect recognition occurs when the 
apparent frequency of the correct alterna- 
tive is greater than the apparent frequency 
of each incorrect alternative. Errors occur
when apparent frequency of one or more
incorrect alternatives is indistinguishable 
from the apparent frequency of the correct 
alternative. In these instances, it is be- 
lieved, performance will be above chance 
only insofar as other attributes of the mem- 
ory will reliably distinguish between the 
correct item and incorrect alternatives. 

The items in the present experiment did 
not include a "stem" or premise as is the 
usual case for multiple-choice items. Fre-
quency theory is applicable to the more
usual case if it is assumed that the frequen-
cies of the alternatives have some degree of
specificity to the stem. Or, to say this an- 
other way, the apparent frequencies are to 
some degree, at least, contingent frequen- 
cies. The lack of this contingency in the 
present studies allowed shifting of incorrect 
alternatives from one item to another with 
no appreciable influence on performance. 
However, there is no reason to believe that 
frequency theory will not apply to the more
usual case where contingent frequencies are
established. 

One final conjecture will be made. Appar- 
ent frequency of alternatives may change 
during the process of testing (Underwood & 
Freund, 1970). Consider a case in which the
frequency of the correct alternative is mar- 
ginally distinguishable from one or more in- 
correct alternatives. The subject may, at 
this point, “study” the various alternatives 
carefully in order to get additional informa- 
tion to help reach a decision. However, the 
act of gathering this information may in- 
crease the frequency of the alternatives to 
the point that they are no longer distin-
guishable from the correct alternative.
Therefore, unless the additional informa- 
tion clearly leads to a correct decision, the 
decision can no longer be based upon a fre- 
quency differential. In short, poorer per- 
formance may result if too much time is 
spent in trying to arrive at a decision. 

LISTENING AND NOTE TAKING*

Subjects listened to a set of three 5-minute passages. There were four
orthogonally crossed variables: the position of the criterion passage on an 
imaginary scientific system in the set, note taking while listening, rehearsal 
immediately after listening, and testing. A free-recall test, which was scored 
for number of words and number of ideas, and a multiple-choice examina- 
tion were administered at the conclusion of the experiment. There were more 
words generated and higher multiple-choice test scores when the study 
interval was used for review than when it was used for other activities. 
The number of ideas recalled was favorably influenced by note taking, 
rehearsal, and testing. There were no significant effects due to position of the 
passage in the set. Post hoc analyses indicated significant correlations be- 
tween performance and the individual difference variables of anxiety and 
tolerance of ambiguity. A significant interaction between social desirability 
and performance was obtained for certain of the treatments. Implications for
a minitheory of listening and note taking were indicated.


Despite the relative lack of research on
the topic of listening, it is readily apparent
that one of the most prevalent “learning
sets” for attempting to enhance one’s recall
of the content of a lecture is to take written
transcriptions of the material presented.
Notes appear to serve either or both of two
functions. As an external storage mecha-
nism (Miller, Galanter, & Pribram, 1960)
they can provide a resource for later study
or reference by the learner. As an encoding
mechanism they allow the learner to tran-
scribe whatever subjective associations, in-
ferences, and interpretations occurred to
him while listening. In the extreme case,
note taking which is used solely for the pur-
poses of external storage can only be incom-
patible with efficient learning. Such notes
tend to be taken in mechanical fashion,
they interfere with attention, and they may
engender a feeling that the task has been
accomplished (for the time being at least).


*The research reported in this paper was sup-
ported by the Advanced Research Projects Agency
(ARPA Order No. 1269) through the United States
Office of Naval Research under Contract ONR
Nonr N00014-67-A-0385-0005.

+ Requests for reprints should be sent to Francis
J. Di Vesta, Department of Educational Psychol-
ogy, Pennsylvania State University, 311 Rackley
Building, University Park, Pennsylvania 16802.


If the learner feels that a “good set of notes” 
is the equivalent of studying, he may by- 
pass review, rehearsal, or the simplest of 
transformational encoding. 

By our reasoning, the kind of note taking 
which serves a role in encoding should be 
much more efficient than one used only for 
external storage purposes. The behavior of 
the student employing encoding or other 
transformational processes reflects a trans-
action between the learner and the material
to be learned, that is, it assumes or suggests
an active learner. In a sense, the learner has
taken the initiative necessary to put the
material into long term store; through en-
coding, the learner has linked the material
to his existing cognitive structure: he has
made it meaningful. Prior to the conduct of
this experiment we assumed that without
special training most students used notes
for external storage. Since this assumption
was made with some hesitation it appeared
that, from the point of view of program-
matic research on instructional strategy, the
first investigations should be of the effects
of taking notes in naturalistic settings on
learning and retention. If, on the one hand,
note taking is found to interfere with recall
then investigations must be conducted on
how best to improve attention without
notes, and how to develop learning sets re-
lated to efficient listening behavior. On the
other hand, if note taking clearly is found 
to enhance learning and retention, investi- 
gations must be conducted to examine the 
conditions under which it facilitates learn- 
ing, that is, when and how notes are to be 
taken. 

The foregoing suggests the possibilities 
for research and application to instruction 
implied in a minitheory of listening and 
note taking. The objective in the present 
study was an initial step in what we hope 
will be a series of empirical investigations 
on study habits within the above frame-
work. Specifically, the present study was
conducted to determine the effect of note 
taking in conjunction with the opportunity 
to review the material learned on later re- 
call. In as much as prior research (Roth- 
kopf & Bisbicos, 1967) has suggested that 
testlike events following a communication 
can affect later recall of the message, the 
effects of testing, as another variable, were 
also examined. 


METHOD


Design 


The subjects in this experiment listened to three 
5-minute passages. For each passage the overall 
procedure consisted of three segments: a 5-min- 
ute period in which the subjects listened to a 
recorded communication; a 5-minute interval; 
and a 3-minute testing period. Note taking was
manipulated in the listening segment. During this 
period, half of the subjects were permitted to take 
notes while the message was being presented; 
the other half were not permitted to do so. Each 
of these two groups was further subdivided for 
the two treatments administered in the second 
interval during which the subject was either 
allowed to rehearse the communication (by us- 
ing his notes, or by contemplating whatever he 
could remember of the message) or, the subject 
was prevented from rehearsing (by requiring him 
to work on a spatial relations test). The final 
subdivision of groups was made in the third seg- 
ment of the procedure during which time half of 
the subjects within each of the groups mentioned 
above took a fill-in test on the material pre- 
sented; the other half worked on a spatial rela- 
tions test. This procedure was followed for each 
of three communications, only one of which was 
to be used for criterion purposes. At the con- 
clusion of the experiment a free-recall and a 
multiple-choice test were administered, in that 
order, on the contents of all passages. The results
for each set of data on the criterion passages
were analyzed via a 2 × 2 × 2 × 3 factorial
analysis of variance with two levels of note tak- 
ing (notes and no notes), two levels of rehearsal 
(rehearsal and no rehearsal), two levels of testing 
(test and no test), and the position (first, second, 
or third) of the criterion passage in the sequence 
of three communications. 


Subjects 


One hundred and twenty subjects were assigned 
randomly to one of the 24 experimental condi- 
tions with the restriction that an equal number 
(n = 5) of subjects be run in each condition.
The subjects received credit toward their final 
grades in the course for their participation. No 
subject had participated previously in an experi- 
ment where connected discourses had been used. 
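
The cell structure described in the Design and Subjects sections can be verified with a short enumeration. The following is a minimal sketch only: the factor labels, the function names `build_conditions` and `assign_subjects`, and the random seed are illustrative assumptions and are not part of the original procedure.

```python
import itertools
import random

# Factor levels as described in the Design section; label strings are illustrative.
FACTORS = {
    "note_taking": ["notes", "no notes"],
    "rehearsal": ["rehearsal", "no rehearsal"],
    "testing": ["test", "no test"],
    "position": [1, 2, 3],
}

def build_conditions():
    """Return every cell of the 2 x 2 x 2 x 3 factorial design."""
    return list(itertools.product(*FACTORS.values()))

def assign_subjects(n_subjects=120, seed=0):
    """Randomly assign subject numbers to cells with an equal number per cell."""
    conditions = build_conditions()
    per_cell, remainder = divmod(n_subjects, len(conditions))
    if remainder:
        raise ValueError("subjects cannot be split equally across cells")
    # One slot per subject, then shuffle so that assignment is random.
    slots = [cond for cond in conditions for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return dict(enumerate(slots, start=1))

conditions = build_conditions()
assignment = assign_subjects()
cells = len(conditions)               # number of experimental conditions
per_cell = len(assignment) // cells   # subjects per condition
error_df = len(assignment) - cells    # within-cell degrees of freedom
```

Under these assumptions the enumeration gives 24 conditions with 5 subjects each, and the within-cell degrees of freedom (120 − 24 = 96) agree with the denominator of the df = 1/96 reported for the F tests.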


Materials 


A set of three 5-minute passages on different 
topics (hair seals, bow porcelain, and Xenograde 
systems) was taped for use as the communica- 
tion materials. Each passage contained 500 words. 
The information on hair seals and bow porcelain 
was taken from the Encyclopedia Americana 
(1963). The passage on Xenograde Systems was 
edited from the first chapter of material on An 
Imaginary Scientific System devised by Merrill 
(see for example, Merrill 1965a, 1965b) for ex- 
perimental purposes. All topics were sufficiently 
unique so as to be unknown to the subjects prior 
to the experiment. Since the beginning of each 
passage was marked on the recording tape, the 
order of presentation of the passages could be 
controlled by the experimenter. In those treat- 
ments where interpolated material was required, 
two spatial-ability tests were administered to the 
subjects: Flags: A Test of Space Thinking 
(Thurstone & Jeffrey, 1956) and A Space Rela- 
tions Test from the Differential Aptitude Test 
Battery (Bennett, Seashore, & Wesman, 1947). 

Only the results of the passage with a mean- 
ingful underlying theme (that is, the passage 
on the Xenograde System) were scored. The ma- 
terial on bow porcelain and hair seals was not 
analyzed because they were used primarily as a 
vehicle for manipulating position of the Xeno- 
grade passage within a set of passages. Further- 
more, an attempt to score the content contained 
in the two passages immediately indicated that 
the number of specifics such as dates, proper 
names, and esoteric labels made a criterion list 
of ideas unwieldy and the scoring unreliable. 

A battery of five tests, each tapping a differ- 
ent personality variable, had been administered 
to all subjects in a testing session previous to 
and independent of this experiment. Since these 
test scores were available they were used to investi- 
gate possible relationships between individual 
differences and performance on the tasks. The 
five tests were: the facilitating and debilitating 
anxiety subscales of the Achievement Anxiety
Scale (Alpert & Haber, 1960); Intolerance of Ambiguity
Scale (Budner, 1962); Social Desirability
Scale (Crowne & Marlowe, 1964); The Dogmatism
Scale (Rokeach, 1960); and The Internal-External
Scale (locus of control) (Rotter, 1966).


Procedure 


At least two and never more than four sub- 
jects participated in the experiment at one time. 
Each subject worked at an isolated station with 
barriers of masonite particle board between sta- 
tions to prevent sharing of information. The num- 
ber of subjects run at any one time was variable 
because, on occasion, some subjects who were
scheduled for a given period failed to appear for
the experiment. No two subjects were ever run 
in the same treatment condition at any one time. 
All subjects were assigned to their respective 
treatments at random when they arrived at the 
laboratory. 

After the subjects were seated at their stations 
the experimenter instructed them to read the in- 
structions silently. Questions were quietly an- 
swered whenever assistance was required. The in- 
structions stressed that each person would be 
doing something different during the experiment. 
All persons were informed that the experiment 
was designed to investigate how people learn new 
materials. Furthermore, all subjects were told be- 
forehand that the experiment would consist of 
three passages each with a sequence of three seg- 
ments plus a final test on the three topics. The 
subject was in the same experimental treatment 
for all passages. 

When the first 5-minute listening session was 
begun, the subjects in the note-taking treatment 
had been informed that they could take notes 
on the passages being presented. The subjects in 
the listening-only (no notes) treatment had been 
told only to listen to the passage. They were not 
permitted to take notes during that time. 

When the presentation of the communication 
was completed the second 5-minute interval began.
The subjects who were in the rehearsal condition
were instructed that they were to use the
5 minutes to review what they heard. The sub- 
jects in the no-rehearsal condition were told that 
they would spend the 5 minutes working on some 
other material. During this time the latter groups 
worked on the spatial relations test. 

After the 5-minute study period the final 3-minute
interval was begun. In the testing treatment
the subjects were given the short fill-in
test on the passage. The subjects in the no-testing
treatment spent their time in this interval on
the spatial relations test. This procedure
was repeated for each of the two remaining
passages.

At the conclusion of the presentation of all 
passages, the subject was asked to write down 
everything that he could remember about each 
of the passages. When the subject was finished 
with this task the experimenter then administered 
three eight-item multiple-choice tests, one for each 
of the passages. The entire experiment including
the final examination required about 1 hour to
administer.


Results


The free-recall test was scored for number
of words and number of ideas generated
by each subject. The number of words generated
was scored as sheer volume of recall.
Words, including articles, were counted for
this score. The number of ideas generated
was judged by two raters against a master
list of ideas in the original passage. Inter- 
scorer reliability for this measure, based on 
20 scores, was .95 for the two scorers of the 
papers. A third measure of the subject’s 
performance was obtained from the number 
of correct items on the final eight-question 
multiple-choice test. As noted in the proce- 
dure section, the first two measures were 
scored only for the passage on Xenograde 
Systems. Each set of data was analyzed via 
a 2 × 2 × 2 × 3 factorial analysis of variance
in which the factors were note taking,
rehearsal, test events, and position of the 
passage in the series. 

The analysis of number of words generated
yielded F = 3.77, df = 1/96, p = .06
for the main effect due to the rehearsal
treatment. When a 5-minute study period
followed the listening period, a larger number
of words (X̄ = 108.7) was produced
than when the study interval was filled by
activities unrelated to rehearsal of the passage
(X̄ = 92.3). This result suggests one
influence of rehearsal, as a mathemagenic
activity intervening between the initial
learning session and the recall task, on one
measure of output. None of the other
sources of variance (either main effects or
interactions) was significant in this analysis.

By themselves, the results of the first
analysis do not indicate that achievement
or retention is necessarily affected by mathemagenic
behaviors. They indicate only
that rehearsal prompts the individual to
write “more.” He may do so because the
demand characteristics of the experiment
have been made salient or because he has
more knowledge about which he can write.
With regard to this point, the number of
ideas generated was possibly the most important
measure employed in this experiment.
It reflects both acquisition and retention
of material from listening to the passage
and is more exhaustive of the information
acquired by the subject than are any of the
other measures. The data related to this
measure are summarized for each experimental
condition in Table 1. The analysis
of variance of these data yielded F = 3.87,
df = 1/96, p = .05 for the main effect due
to note taking; F = 8.92, df = 1/96, p =
.004 for the effect due to rehearsal; and F =
11.58, df = 1/96, p = .001 for the effect due
to testlike events. The effects of the position
of the passage in the series and of the interactions
were not found to be significant (p
> .05). These results indicated that subjects
who were permitted to take notes recalled
significantly more ideas (X̄ = 12.0)
than did those subjects who were permitted
only to listen (X̄ = 10.6). The rehearsal
period enhanced the ability of subjects to
recall ideas (X̄ = 12.4) when compared
with a period of similar length filled with
unrelated activities (X̄ = 10.2). Finally,
when the subjects had a test on the material
immediately following the listening period,
their performance (X̄ = 12.5) exceeded
that of those subjects who worked on another test
(X̄ = 10.1). These data imply that mathemagenic
behaviors have relatively direct
effects on acquisition. The lack of a significant
effect due to position suggests that the
experimental treatment failed to develop
learning sets which we assumed would be
acquired by our subjects over the three listening
periods.

The scores on the multiple-choice test, 
given at the end of the experimental session, 
were analyzed in a manner similar to that 
employed for the previously described dependent
measures. Only the F = 8.99, df =
1/96, for the main effect due to note taking
was significant (p = .003) in this analysis.
The subjects who were permitted to take
notes earned higher scores (X̄ = 6.2) on
the multiple-choice test than did those subjects
who merely listened (X̄ = 5.5). While
the effect of note taking is a reliable one,
other effects may not have been isolated because
the test was so short, thereby decreasing
its reliability and affecting the representativeness
of sampling the content of the
passage.

TABLE 1

MEAN NUMBER OF IDEAS RECALLED BY SUBJECTS
IN EACH EXPERIMENTAL TREATMENT

                                  Rehearsal              No rehearsal
Treatment         Position    Related  Unrelated      Related  Unrelated
                               test      test          test      test

Listening only       1         13.0      11.0           8.6       8.6
(no notes)           2         14.2       8.6          11.4       6.6
                     3         12.4      13.2          12.2       7.2

  Overall X̄                    13.2      10.9          10.7       7.5
  Overall SD                    8.2       5.1           3.1       8.9

Listening and        1         14.2      13.2          11.6       9.2
note taking          2         14.0      12.8          14.4      10.0
                     3         11.2      10.6          13.0       9.8

  Overall X̄                    13.1      12.2          13.0       9.7
  Overall SD                    8.8       4.4           4.0       3.4
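
The marginal means quoted in the text can be checked against the overall cell means of Table 1. The sketch below assumes equal cell sizes, so that each marginal mean is the simple average of the four cell means sharing a factor level. The dictionary layout and function names are illustrative. The value 13.2 for the no-notes, rehearsal, related-test cell is taken as the average of its three position means (13.0, 14.2, 12.4).

```python
# Overall cell means from Table 1, keyed by (note taking, rehearsal, interpolated test).
CELL_MEANS = {
    ("no notes", "rehearsal", "related"): 13.2,
    ("no notes", "rehearsal", "unrelated"): 10.9,
    ("no notes", "no rehearsal", "related"): 10.7,
    ("no notes", "no rehearsal", "unrelated"): 7.5,
    ("notes", "rehearsal", "related"): 13.1,
    ("notes", "rehearsal", "unrelated"): 12.2,
    ("notes", "no rehearsal", "related"): 13.0,
    ("notes", "no rehearsal", "unrelated"): 9.7,
}

# Marginal means as reported in the text, keyed by (factor index, level).
REPORTED = {
    (0, "notes"): 12.0,
    (0, "no notes"): 10.6,
    (1, "rehearsal"): 12.4,
    (1, "no rehearsal"): 10.2,
    (2, "related"): 12.5,
    (2, "unrelated"): 10.1,
}

def marginal_mean(factor_index, level):
    """Average the cell means that share one level of one factor."""
    values = [m for key, m in CELL_MEANS.items() if key[factor_index] == level]
    return sum(values) / len(values)

def max_discrepancy():
    """Largest absolute gap between a computed and a reported marginal mean."""
    return max(
        abs(marginal_mean(index, level) - reported)
        for (index, level), reported in REPORTED.items()
    )

# The first cell mean is itself the average of the three position means.
first_cell = sum([13.0, 14.2, 12.4]) / 3
```

With these inputs every computed marginal mean agrees with the reported value to within rounding of the first decimal place.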


Correlations between the five personality 
variables and the dependent measure of 
number of ideas generated were calculated 
for all groups. The results of this analysis 
are summarized in Table 2, where signifi- 
cant relationships are marked with aster- 
isks. Post hoc analyses of two of these 
scales, Social Desirability and Dogmatism, 
were made to identify possible interactions 
with treatments. Preliminary inspection of 
the data in Table 2 indicated that by col- 
lapsing across the test-events treatments for 
subjects who did not take notes, a differen- 
tial relationship between social desirability 
scores and performance might be obtained 
for the two rehearsal treatments (n = 30 in
each group). The pooled means and standard
deviations for the social desirability
scores of the two groups (rehearsal treatment
X̄ = 49.07, SD = 5.65; no-rehearsal
treatment X̄ = 49.88, SD = 4.25) were
highly similar and not significantly different
(p > .05). However, social desirability
was positively correlated (r = .50) with the
performance of subjects in the rehearsal
group and negatively correlated (r = −.14)
with the performance of subjects in the no-rehearsal
groups. The two correlation coefficients
are significantly different via Fisher’s
z statistic for the difference between
two correlations (z = 2.52, p < .01). The
pooled correlation coefficient representing 
the relationship between Dogmatism (n = 
60) and subjects who had test events was
73 while that between Dogmatism and
subjects who had no test events was .19.
The difference between these two coefficients
was not significant (p < .10).

TABLE 2

SUMMARY TABLE OF CORRELATION COEFFICIENTS BETWEEN INDIVIDUAL DIFFERENCE SCORES AND
NUMBER OF IDEAS GENERATED FOR EACH TREATMENT GROUP IN THE EXPERIMENTᵃ

Individual difference      Rehearsal
variables
                           Test events | No test events | Test events

Debilitating anxiety          .05           −.25            −.28
Facilitating anxiety         −.07
Tolerance of ambiguity       −.14
Social desirability           .04
Dogmatism                     .23
Locus of control              .18

ᵃ n = 15 in each treatment group.
* p < .05.
** p < .10.
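
The comparison of the two social desirability correlations (r = .50 and r = −.14, n = 30 in each group) can be reproduced approximately with Fisher's r-to-z transformation. The following is a sketch under the usual assumptions of that test (independent samples, standard error based on n − 3). The function names are illustrative, and the two-sided p value is one conventional choice, since the text does not say whether a one- or two-tailed test was used.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1 = math.atanh(r1)  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error

def two_sided_p(z):
    """Two-sided p value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

z = fisher_z_difference(0.50, 30, -0.14, 30)
p = two_sided_p(z)
```

The statistic obtained this way is close to the reported z = 2.52. The small difference presumably reflects rounding of the published correlation coefficients.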


Discussion 


The results of the present study clearly 
demonstrate that student activities can be 
effectively manipulated through strategies 
that simulate instructor behaviors in naturalistic
settings. Those strategies which emphasize
note taking, immediate opportunity
for review, and test events are efficient ones 
for the recall of main ideas acquired during 
listening to a presentation. Apparently 
these effects are not cumulative as implied 
by the absence of significant interactions. 
However, we strongly suspect that such interactions
might be obtained if the length of
the passages were to be increased.

The findings concerning test events are 
supportive of those obtained by Rothkopf 
(1965) and Rothkopf and Bisbicos (1967) 
who also found questions after learning had 
a facilitating effect on retention of written 
material. The activities activated by test
events presumably increase the salience of 
certain ideas within a passage. If this is a 
correct assumption, we must also assume 
that selection of ideas is made from material
already stored in memory or that the
experimental instructions created expectations
which encouraged the subject to employ
efficient study methods. Since there has
been no feedback or correction on the test
event it cannot be assumed that the test
functions as another practice trial. Similar
explanations were offered by Chapman
(1932) and Lawrence and Coles (1954) who
noted that postinstructions influenced what
is remembered in a perceptual task.

No rehearsal

Notes
Rehearsal No rehearsal

No test events |Test events | (No... | Tent, | Notes
−.07 −.24 −.28
.29 −.38 .52*
−.08 −.07 .54*
.28 −.80 B1
10 −.09 .25
. .00 .26

It is possible that the instructions and
test events in this study indicated what
must be attended to in subsequent passages
(that is, expectations or orienting habits
were influenced by instructions and participation
in the task). However, if this were so
we would have found significant effects due
to position of the passage. Another experiment
is required which is designed, specifically,
to test the relative merits of the postlearning
scanning hypothesis and the expectation
hypothesis.

Learning increases following a rehearsal 
period. In itself, this is not a surprising conclusion
and supports results obtained in a
number of other studies as well as common-sense
observation. The typical explanation
is that repetition of material learned during
listening increases the habit strength of
ideas acquired, or some similar notion.
However, we also suggest that such a period
may provide an opportunity for consolidation.
Hebb (1966), for example, notes that
the complete effect of whatever takes place
during learning comes to fruition only after
a period of contemplation, that is, a period
during which the learning can “set” or
“gel.” In this regard, the present authors
know of no studies which have directly ex- 
amined the effects of a consolidation period 
in learning from connected discourse. 

With the exception of studies by Craw- 
ford (1925a, 1925b) the few early studies 
on note taking provided no convincing evi- 
dence that this activity was either benefi- 
cial or detrimental to learning while listen- 
ing. More recently Berliner (1970) found a 
significant effect of note taking when meas- 
ured by one form of a test but not with 
another. In the present experiment, taking 
notes clearly led to an increase in the num- 
ber of ideas recalled. Furthermore it was 
the only variable which elevated scores on 
the multiple-choice test. Instead of interfer- 
ing with learning, as originally hypothe- 
sized, note taking appears to sensitize the 
learner to certain aspects of the communi- 
cation. The transformation is one of acting 
on the incoming information, sifting out rel- 
evant material, and organizing important 
content which is then recorded by the 
learner. The increased attention given to 
these concepts while taking notes increases 
the probability that the concepts will be 
retrieved even though there is little chance 
to review the notes immediately after stud- 
ying. 

The significant differences in correlations 
between social desirability and number of 
ideas generated under the two rehearsal 
treatments suggest some interesting possi- 
bilities for further studies. The rehearsal 
period was necessarily a mental rehearsal 
period (no notes were permitted these sub- 
jects) and the experimenter had no means 
of enforcing rehearsal. Thus, subjects with 
a greater desire to please and conform socially
presumably engaged in rehearsal because
it “was the thing to do,” thereby resulting
in better criterial performance. Subjects
who had low social-desirability scores
were unaffected or less affected by this 
treatment. This opportunity was lacking in
the no-rehearsal group, thus resulting in a
near-zero relationship.

Individual differences in dogmatism also 
suggested an interaction with the testing 
treatments. When high dogmatic subjects 
experienced a related fill-in-the-blank test 
after listening to a passage, they performed 
more poorly on “the number of ideas” criterion
than high dogmatic subjects who experienced
only unrelated tests. High dogmatics
who tend to rely on authority (Rokeach,
1960), after having taken a structured and 
arbitrarily selective test, which implied an 
authority standard, may have been left 
without direction by the completely un- 
structured, self-dependent free-recall test 
where they were forced to set their own 
standards. However, in situations where 
they experienced no previous questions from 
an authority about the material (i.e., unre- 
lated test events) dependence on an author- 
ity’s requirements was not made salient. 

In summary, we speculate that note tak- 
ing and rehearsal function as learning aids 
which facilitate encoding. Test events increase
the salience of certain ideas expressed
in a communication and may clarify
the instructor’s expectations regarding
the kind of transformations required. Re- 
view provides an opportunity for consoli- 
dating the information learned at a given 
level of transformation. All strategies pro- 
vide the student with standards by which 
he evaluates how his study plan, in the 
sense of “plan” as defined by Miller et al. 
(1960), is to be implemented and his prog- 
ress in implementing the plan. 

The reflections of student activities as
consequences of these instructional strategies
are assumed to take the form of such
observable outcomes as approach or avoid- 
ance of situations, time spent at a task, or, 
as in the present study, number of words 
generated. These are general outcomes of 
activities which mediate other outcomes 
typically classified as performance changes, 
and should not be confused with the kind of 
outcomes typically associated with course 
objectives. It appears important that the 
distinctions between outcomes which reflect 
the attributes of mediating behaviors and 
outcomes which are the consequences of 
these behaviors should be maintained in
further experimentation if activities, instrumental
in learning, are to be understood.
That the student does something (such as 
note taking, test taking, and so on) as a 
consequence of the instructional strategy is 
clear. Similarly, it is clear that these strategies
affect achievement outcomes. The ways
in which the student activities actually mediate
outcomes are less clear. What needs to
be identified are the kinds of activities
which affect what have been described elsewhere
(Di Vesta, 1970) as Type I (associative),
Type II (conceptual), and Type III
(inferential) transformations. The identification
of activities that make certain stimuli
more effective than others seems a reasonable
objective for further investigations of listening
behavior. Especially important are
studies to determine the kinds of activities 
which produce different goal expectations. 
Above all, the relationships between these 
and specific instructional objectives need 
still to be determined.


The effects that the placement of additional equipment in preschool
classrooms has on the cognitive and perceptual development of Negro 
preschool children were evaluated. One hundred and twenty-three sub- 
jects were randomized into six experimental and six control classes. 
Pretests and posttests of the Binet IQ, the Wechsler Preschool and Pri- 
mary Scale of Intelligence Performance IQ, and four subtests of the 
Illinois Test of Psycholinguistic Abilities were administered. Both desirable
and undesirable effects resulted from the environmental enrichment.
Perhaps certain claims about the cognitive and perceptual value
of play materials ought to be reconsidered.


The quantity and quality of the play materials
available to preschool children have
long been considered important in their development
(Isaacs, 1968 [first publication,
1929]; Montessori, 1965 [first publication,
1914]). Textbooks in early childhood (e.g.,
Leeper, Dales, Skipper, & Witherspoon,
1968; Read, 1966) usually stress the equipment
and supplies available in preschool
classrooms. 

Very little empirical research could be lo- 
cated that dealt with the relationship be- 
tween play materials and the cognitive and 
perceptual development of young children. 

Opinions about the value and effective- 
ness of play materials are abundant, but 
differ considerably. On the one extreme is a 
Creative Playthings’ ad (1969) which sug- 
gests that toys “expand the sensory, motor, 
and perceptual skills.” 

On the other hand, Bereiter and Engel- 
mann (1966) hold that “an object-rich en- 
vironment is ineffective in compensating 
for the child’s toy deficit and in stimulating 
learning [p. 72].” 

A number of authors (ENKI Corpora- 
tion, undated; Murphy, 1968; Olson & Lar- 
son, 1965; Ward, 1968) have argued that 
various materials produce differential de- 
velopmental gains. They suggest, for exam- 
ple, that some materials are likely to pro- 
duce gains in verbal ability, while other 
materials are most suited for encouraging
social development. The results of Updegraff
and Herbst’s study (1933) showing
that sociable and cooperative behavior oc- 
curred more frequently during play with 
clay than with blocks empirically support 
this idea. 


Method


Subjects 


Two Head Start classrooms in each of six areas 
of a large city were matched for physical facilities 
and equipment. Each classroom was located in a 
different Head Start center, but matched classrooms
were never more than three blocks apart.
Insofar as it was possible, only children living between
matched centers were selected. These Negro
children were then separated by sex and area of 
the city and randomized into one of the matched 
classrooms. After the initial registration was com- 
pleted, one classroom from each pair was randomly 
selected and “enriched.” 

This procedure resulted in the following subject 
distribution: 36 enriched boys, 44 control boys, 42 
enriched girls, and 39 control girls. The unequal 
numbers between groups were caused both by the 
unstable nature of enrollment and by the use of 
paired classrooms as the randomization unit. 

Additional children who registered throughout 
the year were likewise separated by sex and area 
of the city and randomly assigned to enriched and 
control classes. These children were not included
in the sample. 

Throughout the year a number of subjects either
withdrew from the program or could not be tested
because of excessive absences. The following subjects
were administered all of the cognitive and
perceptual measures: 28 enriched boys, 31 control 
boys, 34 enriched girls, and 30 control girls. The 
enriched boys had a median age of 4 years 1 
month with a range of from 3 years 7 months to 4 
years 7 months at the beginning of the school year; 
the control boys had a median age of 4 years 2 
months with a range of from 3 years 9 months to 
4 years 6 months; the enriched girls had a median 
age of 4 years 3 months with a range of from 3 
years 8 months to 4 years 8 months; the control 
girls had a median age of 4 years 3 months with a 
range of from 3 years 7 months to 4 years 7 months. 
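
The subject counts given above can be tallied to confirm the totals and to show how retention compared across the four groups. The sketch below is illustrative only. The dictionary layout and function names are assumptions, while the counts themselves are those reported in this section.

```python
# Counts reported in the Subjects section: (initially assigned, fully tested).
GROUPS = {
    "enriched boys": (36, 28),
    "control boys": (44, 31),
    "enriched girls": (42, 34),
    "control girls": (39, 30),
}

def totals(groups):
    """Return (total initially assigned, total who completed all measures)."""
    assigned = sum(a for a, _ in groups.values())
    tested = sum(t for _, t in groups.values())
    return assigned, tested

def retention_rates(groups):
    """Proportion of each group that completed all measures."""
    return {name: tested / assigned for name, (assigned, tested) in groups.items()}

assigned_total, tested_total = totals(GROUPS)
rates = retention_rates(GROUPS)
```

The tested total computed this way equals the 123 subjects cited in the abstract and in the description of the cognitive and perceptual measures.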


Teachers 


Teachers in paired classrooms were matched for 
sex, race, and age. All were female and all had
taught previously in the Head Start program. 
None of the teachers was officially certified to teach 
at the preschool level, although all held college 
degrees. Eight of the teachers were Negro and four 
were white. The median difference between the 
ages of paired teachers was 5 years; the range was 
from 1 to 7 years. Two teachers in enriched classes 
and one in a control class left during the year.
Their replacements were matched as before. In the
two enriched classes, the replacement teachers
served for most of the year. In the control class,
there was difficulty finding an appropriate substitute.
For much of the year that class had a succession
of substitutes. One teacher who served for
3 months is treated as the teacher of this class.

Each teacher worked with a teacher’s aide. All 
of these teacher’s aides were Negro females. They 
were not matched in any other way. 


Classrooms 


The enriched and control classrooms were lo- 
cated in church buildings. Five pairs of classrooms 
were in the heart of the Negro community. Two
classrooms were on the fringe of that community, 
but these were attended almost exclusively by 
Negro children. 


Enrichment Procedures 


The subjects were randomized into paired class- 
rooms in September of 1968. One classroom of each 
pair was randomly assigned to the enriched condi- 
tion. Then a substantial amount of equipment and 
supplies was added to the six enriched classrooms.
Each item placed in the enriched classroom was 
chosen specifically to augment one or more of the 
following: verbal ability, performance ability, vis- 
ual perception, auditory perception, and social in- 
teraction. The findings related to social interaction 
are discussed in Busse, Ree, and Gutride, 1970. 

A sample of the materials placed in the enriched 
classrooms included: a tape recorder, a Polaroid 
camera, book sets, rubber farm animals, sound 
cylinders, magnets, wooden puzzles, a shape-sorting 
box, prisms, rhythm band instruments, record sets, 
Negro dolls, Negro community workers (rubber 
figures), and Negro puppets. The listed cost of the 
materials for each enriched classroom totaled approximately
$1,300.

A number of suggested lists of equipment and 
supplies for the preschool classroom were studied 
before choosing the enrichment materials (Associa- 
tion for Childhood Education International, 1968; 
Evans, 1966; National Child Research Center, un- 
dated; Olson & Larson, 1965). An attempt was 
made to avoid duplication of equipment and sup- 
plies typically found in Head Start classrooms by 
taking an inventory in five classrooms in the ex- 
perimental area prior to the study. 

The equipment and supplies, except for a few
back-ordered items, were placed in the enriched
classrooms in late October, 1968. In addition, the
teachers were supplied with flashbulbs, film, and
tapes throughout the year. The effect of adding the
equipment and supplies to the enriched classrooms
was to take meagerly equipped classrooms and turn
them into “dream” classrooms.

A three-page table listing the supplementary
equipment placed in each enriched classroom has
been deposited with the American Documentation
Institute. Order Document No. 01627 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington, D.C.
20540. Remit in advance $5.00 for photocopies or
$2.00 for microfilm and make checks payable to:
Chief, Photoduplication Service, Library of Congress.

No special training was given to the teachers in 
the enriched classes, except for the instructions in 
how to operate the tape recorder and camera. 

All the equipment in the 12 classrooms was in- 
ventoried by the experimenters at the end of June, 
1969. Most of the enrichment materials were still 
in the enriched classrooms at the end of the year. 
The superiority of the enriched classes in terms of 
play materials was evident. 


Cognitive and Perceptual Measures 


The Stanford-Binet IQ test (Terman & Merrill, 
1960), the five performance subtests of the Wechs- 
ler Preschool and Primary Scale of Intelligence 
(WPPSI; Wechsler, 1967), and four subtests of 
the Illinois Test of Psycholinguistic Abilities 
(ITPA; Kirk, McCarthy, & Kirk, 1968) were ad- 
ministered twice to the 123 subjects. Most of the 
pretests were administered during November, 1968; 
a majority of the posttests was given during May, 
1969. The mean time between pre- and posttesting 
was 24 weeks for the Binet, 28 weeks for the 
WPPSI, and 25 weeks for the ITPA subtests. 

The five performance subtests of the WPPSI 
given were animal house, picture completion, 
mazes, geometric design, and block design. 

The four subtests of the ITPA used were: (a) 
“visual reception,” in which the examiner exposes 
printed stimulus and then asks the subject to 
find it among three others printed on a separate 
page; (b) “visual sequential memory,” in which 
the examiner exposes a picture showing a particu- 
lar ordering of geometric items that the subject 
then has to reproduce with a set of chips imprinted 
with the same geometric shapes; (c) “auditory re-
ception,” in which the examiner asks the subjects
to respond “yes” or “no” to items such as “Do
boys play?” and “Do dresses sing?”; and (d) “auditory
sequential memory,” in which the subject is asked
to repeat a series of digits that has been read to
him at ½-second intervals.

These tests were chosen to evaluate develop- 
mental gains in verbal ability (Binet), performance 
ability (WPPSI performance subtests), visual per- 
ception (visual reception and visual sequential 
memory), and auditory perception (auditory re- 
ception and auditory sequential memory) for the 
enriched and control classes. 


Teacher Behavior Instruments 


Two different facets of teacher behavior were 
studied. First, an interaction measure of the teach- 
ers’ encouragement of the use of equipment was 
obtained. Second, the teachers were ranked as to 
their effectiveness in fostering cognitive and per-
ceptual learning in their children. The teachers
were not aware that they were being observed; 
they thought that the observers were recording 
only the children’s behavior. 

Each teacher was observed for six 30-minute pe- 
riods on a random basis from January to June, 
1969.⁵ Each teacher was observed during the same
six time periods (e.g., 9:00-9:30) but on different
mornings. Every 30 seconds the recorder checked 
off, as present or absent, teacher encouragement of 
the use of equipment. 

Teacher encouragement was considered to en- 
compass the following specific behaviors: 

1. Exhortation toward the use of equipment and 
supplies. For example, “Let’s all play with crayons
now.”

2. Physical assistance in the use of equipment 
and supplies. Most probably individually directed 
(e.g., helps child with ruler; moves paint brush as
the child holds it). 

3. Instruction about equipment and supplies 
(purely descriptive). Teacher must endeavor to in-
volve the child with described equipment. For ex-
ample, “This is a ruler. It is used to measure
things.” or “This is a map of Pennsylvania. Here 
is where we live.” 

4. Instructions about methods of use. Teacher 
must try to involve the child with equipment. For 
example, “You fill up the can with water, then you 
dip the paint brush in.” or “You fold your paper 
in half like this.” 

5. Questions leading to the use of equipment and 
supplies. For example, “Can anybody make an air- 
plane with this paper?” 

The total number of periods out of 360 during 
which the use of equipment and supplies was en- 
couraged is the teacher’s score on this measure. 

For reliability purposes, 12 different teachers
were each observed on the same occasion by two
different raters for 60 30-second periods. The per-
centage of periods in which two raters agreed on
the scoring of encouragement as being either pres-
ent or absent was 94.4%.
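
The counts behind these two measures can be checked with a short sketch. The function names and the ten-period example codes are hypothetical and serve only as illustration. The figure of 680 agreements is an inference, not a value reported in the text: it is the only whole number of agreements out of 12 × 60 = 720 jointly scored periods that rounds to 94.4%.

```python
def periods_per_teacher(sessions=6, minutes_per_session=30, seconds_per_period=30):
    """Number of scoring periods per teacher under the observation schedule."""
    return sessions * (minutes_per_session * 60) // seconds_per_period


def percent_agreement(rater_a, rater_b):
    """Percentage of periods on which two raters gave the same present/absent code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of periods")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Six 30-minute sessions scored every 30 seconds give the 360-period denominator.
print(periods_per_teacher())  # 360

# Hypothetical codes for 10 periods (True = encouragement scored as present).
rater_a = [True, False, False, True, False, False, True, False, False, False]
rater_b = [True, False, False, False, False, False, True, False, False, False]
print(percent_agreement(rater_a, rater_b))  # 90.0

# Reliability sample: 12 teachers x 60 periods = 720 jointly scored periods.
# 680 agreements is an inferred count (assumption), consistent with 94.4%.
print(round(100.0 * 680 / (12 * 60), 1))  # 94.4
```

Simple percentage agreement does not correct for agreement expected by chance, so it is best read as an upper-bound description of rater consistency.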

The second teacher behavior measure was a
ranking of the teachers in terms of effectiveness in
fostering cognitive and perceptual learning in their
children. Where more than one teacher was in
charge during the year, the teacher who was there
longest was used in this ranking. The ranking was
done once, after the close of the data collection, by
two observers who spent much of the school year
in those teachers’ classes. The Spearman rank-order
correlation between the rankings of the two ob-
servers was .85 (p < .01, for a one-tailed test);
since there was communication between observers
throughout the year, this correlation must be taken
lightly. The mean of the two observers’ rankings
was used as a teacher’s rank in the analyses.

⁵ In one control center, five observations were
done on several different teachers, each of whom
taught for a time. A sixth observation was not done
because of teacher absence; the mean of the other
observation periods was substituted.


RESULTS 
Analyses of Covariance 


Binet IQ, WPPSI Per-
formance IQ, and the visual reception, vis-
ual sequential memory, auditory reception,
and auditory sequential memory subtests of 
the ITPA are the measures analyzed here. 

The six posttest variables were separately
analyzed using a 2 (treatments: enriched
and normal classrooms) × 6 (blocks) anal-
ysis of covariance design with the appropri-
ate pretest as a covariate in each analysis.
Treatments was considered to be a fixed
factor; blocks, to be random. The unit of
analysis chosen was class means rather than
individual subjects (Peckham, Glass, &
Hopkins, 1969).⁶

The number of subjects in each cell is 
shown in Table 1. The means of the pretest 
and posttest scores are presented in Table 2. 
The product-moment intercorrelations of 
the pretest and posttest variables are shown 
in Table 3. 

The analyses of covariance are presented
in Table 4. Significant treatment effects 
were found for the WPPSI Performance IQ 
(p < .06), visual reception (p < .02), and 
visual sequential memory (p < .05). 
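
As a hedged arithmetic check on Table 4, the treatment F ratio can be recomputed as the treatment mean square divided by the Treatment × Blocks mean square, the error term named in the table note. The sketch below uses only the two rows (Binet IQ and WPPSI Performance IQ) whose scanned mean squares reproduce the printed F to two decimals; the dictionary layout and function name are illustrative assumptions.

```python
# (treatment MS, Treatment x Blocks MS, printed F), transcribed from Table 4.
table4 = {
    "Binet IQ": (8.08, 5.48, 1.47),
    "WPPSI Performance IQ": (24.61, 3.36, 7.32),
}


def f_ratio(ms_effect, ms_error):
    """F ratio for an effect tested against the given error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


for name, (ms_t, ms_txb, printed) in table4.items():
    # Degrees of freedom (1 and 4) are those listed in Table 4.
    print(f"{name}: F(1, 4) = {f_ratio(ms_t, ms_txb):.2f} (printed {printed:.2f})")
```

With only 4 denominator degrees of freedom, the class-means analysis needs a large F to reach conventional significance, which is why an F of 7.32 is reported at p < .06 rather than p < .05.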

The means of the pretest and posttest scores
presented in Table 2 show that children in the
control classes
gained significantly more than children in 
the enriched classes in both WPPSI Per- 
formance IQ and visual reception. On the 
other hand, children in the enriched classes 
gained significantly more in visual sequen- 
tial memory. 
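
The direction of these differences can be seen from the raw pretest-to-posttest gains implied by Table 2. The sketch below is only a descriptive illustration: the significance tests in the text rest on covariance-adjusted posttest scores, not on raw gains. Auditory sequential memory is omitted because the control posttest mean is truncated in the scanned table, and the auditory reception control posttest is read here as 33.72 (an assumption about a garbled entry).

```python
# (pretest mean, posttest mean) transcribed from Table 2.
table2 = {
    "Binet IQ": {"enriched": (94.56, 102.15), "control": (94.75, 100.72)},
    "WPPSI Performance IQ": {"enriched": (91.05, 95.48), "control": (88.97, 96.59)},
    "Visual reception": {"enriched": (36.06, 36.10), "control": (37.48, 39.25)},
    "Visual sequential memory": {"enriched": (33.68, 37.02), "control": (33.54, 34.08)},
    "Auditory reception": {"enriched": (32.89, 34.29), "control": (32.28, 33.72)},
}


def raw_gain(pre_post):
    """Posttest mean minus pretest mean, rounded to two decimals."""
    pre, post = pre_post
    return round(post - pre, 2)


gains = {
    test: {group: raw_gain(scores) for group, scores in groups.items()}
    for test, groups in table2.items()
}

for test, g in gains.items():
    print(f"{test:26s} enriched {g['enriched']:+.2f}   control {g['control']:+.2f}")
```

On these raw figures the control group gains more on WPPSI Performance IQ (7.62 vs. 4.43) and visual reception (1.77 vs. 0.04), and the enriched group gains more on visual sequential memory (3.34 vs. 0.54), matching the pattern described above.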


⁶ Analyses using either individual subjects or
class means would seem to be reasonable in the
present study because intact classes were not in-
volved. Class means were used in the analyses pre-
sented here since this procedure is the more con-
servative of the two (i.e., less likely to produce a
Type I error). However, separate analyses using
individual subjects and including sex as a factor
were done. The significant results are the same as
those found in the analyses using class means, al-
though the magnitude of the significance is greater
when using individual subjects. None of the Sex ×
Treatment interactions was significant in the analy-
ses using individual subjects.

TABLE 1
COMPOSITION OF THE SAMPLE

                         Blocks
Subjects      1     2     3     4     5     6     Total
Enriched     10    14     8    10    12     8       62
Control       9    13    10    10    11     8       61


Teacher Encouragement of the 
Use of Equipment 

The median number of 30-second periods 
out of 360 during which teachers in the en- 
riched classes encouraged the use of equip- 
ment was 51.50 with a range of from 23 to 
67. For control teachers, the median was 
51.00 with a range of from 17 to 62. The 
difference is not significant (Mann-Whitney 
U = 16.5, p > .10). 

Spearman rank-order correlations were 
computed between the frequency of a teach- 
er’s encouragement and the mean residual 
gain scores for the six cognitive and percep- 
tual variables in the teacher’s class. None of 
the gain scores by classes was significantly 
related to a teacher’s encouragement of the 
use of equipment in her class. Nor were 
there any significant relationships when 
these correlations were computed separately 
for teachers in the enriched and control 
classes. Thus there was no interaction effect 
between the amount of teacher encourage- 
ment and whether or not a classroom was 
enriched. 


Teacher Effectiveness 


Each teacher was ranked according to her 
effectiveness in fostering cognitive and per- 
ceptual learning in children. The observers 
did not make these ratings using specific 


TABLE 2

MEANS OF COGNITIVE AND PERCEPTUAL VARIABLES

                                Enriched subjects      Control subjects
Tests                           Pretest   Posttest     Pretest   Posttest
Binet IQ                         94.56     102.15       94.75     100.72
WPPSI Performance IQ             91.05      95.48       88.97      96.59
Visual reception                 36.06      36.10       37.48      39.25
Visual sequential memory         33.68      37.02       33.54      34.08
Auditory reception               32.89      34.29       32.28      33.72
Auditory sequential memory       42.50      42.56       40.77      41.1[?]


TABLE 3
INTERCORRELATIONS BETWEEN PRETEST AND POSTTEST VARIABLES USING CLASS
MEANS AS THE UNIT OF ANALYSISᵃ

Tests                                       1     2     3     4     5     6     7     8     9    10    11
 1. Binet IQ pretest                       --
 2. WPPSI Performance IQ pretest          .84    --
 3. Visual reception pretest              [?]   [?]    --
 4. Visual sequential memory pretest      .27   .15   .80    --
 5. Auditory reception pretest            .07   .51   .34   .58    --
 6. Auditory sequential memory pretest    .16   .53   .35  -.46  -.22    --
 7. Binet IQ posttest                     .92   .89   .60   .14   [?]   [?]    --
 8. WPPSI Performance IQ posttest         .70   .83   .91   .25   [?]   .41   [?]    --
 9. Visual reception posttest             .11   .13   .57  -.18  -.23   .26   .13   .49    --
10. Visual sequential memory posttest     .52   .65   .42   .22   .46   .40   .59   .54   .19    --
11. Auditory reception posttest           .26   .44   .16   .06   .25   .28   .81   .21  -.45  -.07    --
12. Auditory sequential memory posttest   .31   .59   .59   .02   .14   .04   .86   .47   .37   .70   .10

ᵃ n = 12.
[?] marks an entry that is illegible in the source.


behavioral criteria. Rather they were in- 
structed to use their clinical judgment. 
Mean rankings of the two observers yielded 
an overall effectiveness rating for each 
teacher. The mean rank for the six teachers 
in enriched classes was 6.83 and for the six 


control teachers was 6.17. Thus the control 
teachers were judged as slightly more effec- 
tive; however, the difference is not signifi-
cant (Mann-Whitney U = 14.5, p > .10).
Teacher effectiveness ratings had a Spear- 
man rank-order correlation of .51 with 


TABLE 4

ANALYSES OF COVARIANCE OF POSTTEST SCORES OF COGNITIVE AND PERCEPTUAL VARIABLES USING
PRETEST SCORES AS SINGLE COVARIATES

Variable                      Source           df      MS        Fᵃ

Binet IQ                      Treatment (T)     1     8.08      1.47
                              Blocks (B)        5     5.59
                              T × B             4     5.48

WPPSI Performance IQ          T                 1    24.61      7.32*
                              B                 5     7.39
                              T × B             4     3.36

Visual reception              T                 1    22.06     16.00***
                              B                 5     7.36
                              T × B             4     1.42

Visual sequential memory      T                 1    31.14      8.82**
                              B                 5    14.32
                              T × B             4     3.58

Auditory reception            T                 1      .85       .27
                              B                 5     1.82
                              T × B             4      [?]

Auditory sequential memory    T                 1     3.79       [?]
                              B                 5     6.90
                              T × B             4     1.06

ᵃ The mean square for the Treatment × Blocks interaction is used to test the treatment effect.
[?] marks an entry that is illegible in the source.
* p < .06.
** p < .05.
*** p < .02.

teacher encouragement of the use of equip-
ment (p < .05, for a one-tailed test).

Spearman rank-order correlations were 
computed between the teacher effectiveness 
rankings and the mean residual gain scores 
in the classes for the six cognitive and per- 
ceptual variables. None of the correlations 
was significant. Likewise, when these corre- 
lations were computed separately for teach- 
ers in enriched and control classes, none was 
significant. 
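
For readers who want to reproduce this kind of analysis, the following is a minimal sketch of a Spearman rank-order correlation, computed as a Pearson correlation on average ranks so that tied values (such as tied mean rankings from two observers) are handled. The five-class data set is hypothetical; the study's class-level rankings and residual gains are not reported in the text.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data for five classes (illustration only, not study data).
effectiveness_rank = [1, 2, 3, 4, 5]          # 1 = judged most effective
residual_gain = [0.4, 0.9, -0.1, 0.2, -0.6]   # mean residual gain per class
print(round(spearman_rho(effectiveness_rank, residual_gain), 2))  # -0.8
```

Because rank 1 denotes the most effective teacher, a negative coefficient in this hypothetical example would mean that better-ranked teachers had larger residual gains.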


DISCUSSION


Taken as a whole, the findings show that 
the enrichment significantly altered the 
classroom environment. Signs of the altera- 
tion were present in both the cognitive and 
perceptual development of the children. 

Play materials and the related equipment 
placed in the enriched classes were specifi- 
cally chosen for their hypothesized ability 
to produce gains in verbal ability, perform- 
ance ability, visual perception, and audi- 
tory perception. Specific measures were in- 
cluded to evaluate the effects of the en- 
richment on each of these. 

No differences between enriched and con- 
trol children were evident in verbal ability 
or auditory perception. However, the con- 
trol children showed significantly greater 
gains in performance ability than did the 
enriched children. The differences in gains 
between enriched and control children are 
mixed for visual perception. Gains in visual 
reception were significantly greater for the 
control children, but gains in visual sequen- 
tial memory were greater for the enriched 
group. 

One hypothesis which might account for 
the greater gains of the control groups in 
performance ability and visual reception 
suggests that teachers in the enriched 
classes took advantage of the materials by, 
for example, having less interaction with 
their children. Another hypothesis reasons 
that the control teachers attempted to 
“show” the experimenters. That is, the con- 
trol teachers might have been motivated to 
work harder because of their participation 
in an experiment in which they perhaps pic-
tured themselves as “underdogs.” The find-
ings from the two measures of teacher be-
havior argue against both of these explana-
tions. Neither an interaction measure of the
frequency of the teacher’s encouragement of
the use of materials nor an overall rating
of the teacher’s effectiveness was signifi-
cantly related to mean class gains on any of
the cognitive or perceptual variables. Like-
wise, there were no differences between
teachers in the enriched and control classes
on these measures. However, additional
teacher behavior variables not measured in
this study might have yielded relationships
with the class gain scores (e.g., Linn, 1966).

Another possible explanation of the re- 
sults suggests that children in the enriched 
classes might not have used the play mate- 
rials extensively. But, as reported in a pre- 
vious paper (Busse et al., 1970), data were 
collected showing that the enriched children 
spent more time than control children coop- 
eratively playing with toys. These findings 
suggest that play materials were indeed 
used by the enriched children, rather than 
being left sitting on the playroom shelves. 
Thus, since teachers in the enriched and 
control classes did not differ in their fre- 
quency of encouraging the use of play ma- 
terials, it seems likely that the children 
were spontaneously attracted by the play 
materials without extraordinary teacher in-
tervention.

Several authors (Caldwell & Richmond, 
1968; Gray & Klaus, 1965; Thompson, 
1944) have suggested that the way play 
materials are used may determine their 
effectiveness. Thus, since none of the teach-
ers in this study were certified to teach at
the preschool level, it can be argued that if 
teachers in the enriched classes had a better 
knowledge of how to use the enrichment 
materials, the enriched children would have 
shown greater gains than the control chil- 
dren on all or most of the dependent varia- 
bles. This argument rests on the assumption 
that knowledge of how to use play materi- 
als does exist; but a search of the psycho- 
logical literature turned up no empirical 
data on this question. Moreover, from the 
standpoint of the usefulness and applicabil- 
ity of the findings of this study to other 
Head Start programs, it seemed best not to
introduce complex instructions in the use of
materials, instructions which could not eas-
ily be applied in the non-research setting.

It seems that the most probable reason 
for the findings concerning cognitive and 
perceptual development remains the play 
materials themselves. There can be too 
much of a good thing. 

The results suggest that both desirable 
and undesirable effects can be expected 
from environmental enrichment. At the 
very least, the more extravagant claims for 
the efficacy of certain play materials ought 
to be muted. A “properly” equipped pre-
school classroom is apparently not a pana-
cea for the problems of disadvantaged chil-
dren.

DEVELOPMENTAL STUDY OF THE ACQUISITION AND
UTILIZATION OF CONCEPTUAL STRATEGIES¹


JAMES D. MCKINNEY* 
North Carolina State University 


This experiment tested the effects of instruction in two formally dif- 
ferent strategies on conjunctive concept attainment and problem- 
solving efficiency at two developmental levels. The subjects were 90 
educable retarded children with MAs of 5-6 and 7-8 years. The treat-
ments consisted of instruction in conservative focusing, successive scan-
ning, or no instruction. At the 5-6 MA level, training in focusing failed
to facilitate performance; scanning instruction increased problem-solv-
ing efficiency but did not facilitate concept attainment. At the 7-8 MA
level, instruction in both strategies facilitated performance; however, 
focusing was more effective than scanning. Thus the acquisition of 
complex cognitive operations in young children was a function of both 


the logical structure and informational demands of the task. 


One of the more consistent findings in 
studies of concept learning is that individu- 
als tend to process information according to 
some systematic plan or strategy (Bourne, 
1966; Bruner, Goodnow, & Austin, 1956; 
Miller, Galanter, & Pribram, 1960) and 
that performance is facilitated when a 
strategy is provided for the subject (Klaus- 
meier & Meinke, 1968; Wells & Watson, 
1965). Several recent studies have shown 
that young children are unable to solve a 
series of concept attainment problems even 
when the number of trials to criterion is 
unlimited (Stern, 1965; Wittrock, 1964). 
One investigator noted that the children in
these studies were unable to generate prob-
lem-solving techniques and frequently
ceased attending to the stimuli or engaged
in random behavior (Stern, 1967).

From Piaget’s theory (Inhelder & Piaget, 
1958), it may be argued that a child must
reach the period of formal operations (11
years) before he can use a logical strategy
for hypothesis testing (Anderson, 1965; 
Stern, 1965; Yudin & Kates, 1963). The ac- 
quisition of such processes is described by 
the progressive internalization of the rules 
of logic which emerge in an invariant se- 
quence of stages. This description of devel- 
opment suggests that certain stage-defined 
cognitive processes, for example, transitiv- 
ity and reversibility, must be present before 
learning can effectively contribute to con- 
ceptual development. On the other hand, if 
one assumes a cumulative learning model 
for conceptual development (Gagné, 1968), 
it might be argued that young children have 
difficulty in using strategies because they 
have not been taught the requisite “learning 
sets.” According to this position, develop- 
ment proceeds in a continuous, cumulative 
fashion and involves the learning of in-
creasingly more complex hierarchies of re-
sponse capabilities.

Previous research has shown that chil- 
dren can be taught strategies for concept 
attainment and that they are capable of
transferring such strategies to new prob-
lems (Anderson, 1965, 1968; Stern, 1967;
Stern & Keislar, 1965; Wittrock, 1967). The 
efficiency of strategy utilization has been 
shown to vary with age when subjects are 
untrained in a specific strategy (Olson, 
1966; Tagatz, 1967; Yudin & Kates, 1963), 
and type of strategy when subjects are given 
instructional strategies (Stern, 1967; Stern 
& Keislar, 1965; Wittrock, 1967). The re- 
sults of several studies support the hypothe- 
sis that formal operations as described by 
Piaget are necessary for effective strategy 
utilization (Tagatz, 1967; Yudin & Kates, 
1963); however, since the subjects were not
given extensive training, the cumulative 
learning position was not tested. Similarly, 
Anderson’s (1965, 1968) finding that first 
graders could acquire a conservative focus- 
ing strategy seems to support the cumula- 
tive learning position; however, since type 
of strategy and age were not varied, the 
possibility of an Age X Type of Strategy 
interaction was not tested. Such an effect 
was suggested in two experiments in which 
it was found that young children were una- 
ble to acquire a “complex” strategy but 
were able to use a simpler strategy (Stern, 
1967; Stern & Keislar, 1965). 

Nevertheless, it is not clear from these 
studies whether the age differences that 
were observed were due to the kind of cog- 
nitive operations required by the logical 
structure of the strategy, a result critical to 
the Piagetian position, or to the number of 
such operations that were performed, a re- 
sult favoring the more traditional learning 
models of development. The single versus 
multiple hypothesis-testing strategies de- 
scribed by Stern (1967) and by Stern and 
Keislar (1965) seem to differ only in the 
initial hypothesis the child is assumed to 
entertain and not in the operations he per- 
forms over the trials. In fact, Restle (1962) 
has shown that both single and multiple hy- 
pothesis-testing strategies lead to the same 
expectations in the data. Accordingly, the 
issue of whether age differences in concep- 
tual behavior may be attributed to differ- 
ences in the formal structure of the sub- 
ject’s strategy (a cognitive complexity di- 
mension) or to differences in the task re-
quirements of the strategy (a task complex-
ity dimension) is unresolved. A more ade- 
quate test of the Piagetian position would 
compare problem-solving efficiency at sev- 
eral ages under two structurally different 
strategies, one presumably involving more 
advanced logical operations than the other, 
with task difficulty held constant. 

Two strategies which seem to meet this 
requirement are the conservative focusing 
and successive scanning strategies origi- 
nally observed by Bruner et al. (1956). In 
using the focusing strategy, the child must 
vary each attribute in succession while 
holding all others constant and must apply 
an indirect decision rule for hypothesis test- 
ing, that is, a positive instance indicates an 
irrelevant attribute. As Anderson (1965) 
pointed out, according to Piaget's theory, 
these operations are not developed until 
12-14 years of age. On the other hand, 
scanning requires only one identity trans- 
formation per trial, and hypotheses are 
tested directly, that is, a positive instance 
denotes a relevant attribute. 
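
The indirect decision rule of conservative focusing can be made concrete with a small simulation. This is a sketch under stated assumptions rather than the study's procedure: the attribute values are borrowed from the Blocks problem described in the Method section, while the secret concept, the focus instance, and all function names are hypothetical.

```python
DIMENSIONS = {
    "shape": ("star", "cross", "oval"),
    "color": ("red", "blue", "green"),
    "number": (1, 2, 3),
}


def is_positive(instance, concept):
    """An instance is positive when it shows every value the concept requires."""
    return all(instance[dim] == value for dim, value in concept.items())


def conservative_focusing(focus, oracle):
    """Infer a conjunctive concept from a positive focus instance.

    Each trial changes exactly one attribute of the focus and holds the others
    constant. A positive outcome means the changed attribute is irrelevant; a
    negative outcome means it is relevant (the indirect decision rule).
    """
    inferred = {}
    trials = []
    for dim, values in DIMENSIONS.items():
        alternative = next(v for v in values if v != focus[dim])
        probe = dict(focus, **{dim: alternative})
        outcome = oracle(probe)
        trials.append((probe, outcome))
        if not outcome:
            inferred[dim] = focus[dim]
    return inferred, trials


# Hypothetical two-attribute conjunctive concept and a positive focus instance.
secret = {"shape": "star", "color": "red"}
focus = {"shape": "star", "color": "red", "number": 2}

concept, trials = conservative_focusing(focus, lambda inst: is_positive(inst, secret))
print(concept)      # {'shape': 'star', 'color': 'red'}
print(len(trials))  # 3
```

With three attributes the focuser needs exactly three selections, one per attribute, whatever the concept. A scanner would instead pick one hypothesis (for example, "red") and choose instances that test it directly, so the number of selections depends on which hypotheses are tried first.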

From Piaget’s description of concrete op- 
erations, it might be hypothesized that chil- 
dren with mental ages of 7-8 years would 
perform well with the scanning strategy but 
would be unable to acquire focusing. Simi- 
larly, children with mental ages below 7 
years (preoperational thought) should be 
unable to acquire either strategy. If one as- 
sumes a cumulative learning model, then 
the formal structure of the strategy is irrel- 
evant and the relative difficulty inherent in 
the learning tasks per se can be hypothe- 
sized to generate age orderings. Accord- 
ingly, one should observe no differences be- 
tween the two strategies as a function of 
age when memory load is held constant. 


METHOD


Design and Subjects 


The experimental design was a 2 × 3 factorial
with two levels of mental age and three treat- 
ments. The two mental age levels were 5 and 6 
years (L) and 7 and 8 years (H). The three treat- 
ments consisted of two instructional procedures: 
subjects receiving training in the conservative fo- 
cusing strategy (F); those receiving training in the 
successive scanning strategy (S); and a control
condition in which the subjects did not receive
strategy instruction (C).

TABLE 1
MEANS AND STANDARD DEVIATIONS FOR
CHRONOLOGICAL AGE, MENTAL AGE,
AND IQ FOR EACH GROUP

                      Low MA                       High MA
Variable        F        S        C         F        S        C
CA
  Mean       129.53   137.54   137.33    145.80   140.60   136.87
  SD          14.64    16.51    15.54     13.82    10.26    12.90
MA
  Mean        72.47    74.69    73.40     92.87    96.00    94.13
  SD           8.41     7.16     6.08      7.51     5.25     8.08
IQ
  Mean        62.00    61.23    59.60     71.40    73.80    71.73
  SD           7.62     6.56     6.68      5.59     3.01     4.39

Note. Abbreviations: F = subjects receiving
training in conservative focusing strategy; S =
subjects receiving training in the successive
scanning strategy; C = a control condition in
which subjects did not receive strategy instruc-
tion.

The subjects were 90 educable retarded children
who attended special education classes in the Ra-
leigh, North Carolina, public schools. Four Phase
II (CA 9-11 years) and three Phase III (CA 11-
18 years) classes were used. Initially one class from
each phase was assigned to a specific treatment
group. All subjects were drawn from a noninsti-
tutionalized population currently living with their
parents and without previous institutional experi-
ence. The subjects were judged to be predomi-
nantly [illegible] class in [illegible] level.

The sample was composed of approximately
half Negro and half white children. All subjects
were given the Wechsler Intelligence Scale for
Children, Verbal Scale. The subjects were then as-
signed to MA levels, and the treatment groups
within each level were matched on MA by pairing
subjects of the same sex who were within 3 months
on MA. The initial subject pool was composed of
105 children. In accordance with the matching pro-
cedure, 15 subjects were selected for each experi-
mental group. The sample was composed of 45
males and 45 females.

The means and standard deviations of the chron-
ological ages, mental ages, and IQs for each group
are given in Table 1. An analysis of variance on
MA for each level showed no significant differences
at the .05 level for either the 5-6 year group (F =
.321, df = 2/40) or the 7-8 year group (F = .748,
df = 2/42). An analysis of variance on chronologi-
cal age indicated that the high-MA group was sig-
nificantly older than the low-MA group (F = 4.555,
df = 1/82, p < .05); however, no differences were
observed within MA levels (F = .151, df = 2/82).

A similar analysis for pretest IQ showed a sig-
nificant MA levels effect (F = 85.592, df = 1/81,
p < .005) with the high-MA group superior to the
low-MA group, and a nonsignificant treatment
group effect (F = .744, df = 1/82). It was con-
cluded that the selection procedure had failed to
match all groups adequately on Verbal IQ, and that
the MA levels were partially confounded by IQ.


Experimental Problems 


Problem 1: Flowers. The stimuli were 27 flowers
that were shown on 4¼-inch-square cards. The
stimulus dimensions were size (2, 2¾, and 3¾
inches in diameter), color of petals (red, orange,
and blue), and number of petals (four, six, and
eight). The subjects were instructed to discover
what kind of flowers grew in the examiner’s garden.

Problem 2: Blocks. Twenty-seven plywood
blocks were used. The stimuli varied according to
shape (star, cross, and oval), color (red, blue, and
green), and context element (the numbers 1, 2, and
3). The subjects were asked to find out the exami-
ner’s secret about the blocks.

Problem 3: Animals. The stimuli were pictures
of dogs, cats, and horses that were drawn on 5 × 8
inch cards. The animals were either dark, light, or
spotted, and were either lying, sitting, or standing.
The subjects were instructed to find out what kind
of animals the zookeeper wanted for the zoo.
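
Each problem crosses three dimensions with three values apiece, which yields the 27 stimuli in a display. A brief sketch using the Blocks problem illustrates this and shows how many exemplars a simple or a conjunctive concept has; the function names and the example concepts are hypothetical.

```python
from itertools import product

# Dimensions and values of the Blocks problem as described in the text.
BLOCKS = {
    "shape": ("star", "cross", "oval"),
    "color": ("red", "blue", "green"),
    "number": (1, 2, 3),
}


def build_display(dimensions):
    """Return every combination of one value per dimension as a dict."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


def exemplars(display, concept):
    """Stimuli that show every value required by a (conjunctive) concept."""
    return [s for s in display if all(s[d] == v for d, v in concept.items())]


display = build_display(BLOCKS)
print(len(display))                                                # 27
print(len(exemplars(display, {"color": "red"})))                   # 9
print(len(exemplars(display, {"color": "red", "shape": "star"})))  # 3
```

A two-value conjunctive concept therefore has only 3 positive instances among the 27 stimuli, so most selections made without a strategy would draw negative feedback.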

A standardized procedure, similar to that em- 
ployed by Anderson (1965), was devised which 
seemed suitable for the age range of the subjects 
that were used in the experiment. The instructions 
used familiar words that roughly translated into 
the key procedures of the selection paradigm. For 
example, “secret” was substituted for “concept,”
“pick” for “select,” and “find out” for “discover.” 

The subject first underwent a task familiariza- 
tion procedure in which he was shown the stimu- 
lus display, and the dimensions and values were 
delineated. The experimenter introduced the task
as a game and began by naming each value in
turn and by having the child point to all the in-
stances that showed that value. After all three val-
ues for a dimension were delineated, the experi-
menter named the dimension. This procedure was
followed by another set of three values; however,
on the third set, the subject was asked to name
the dimension and its values. The experimenter re-
viewed the dimensions and values of each and
corrected any errors that were made in labeling.

The experimenter then denoted the idea of the
game (e.g., to find out what kind of flowers grew in
his garden) and named a conjunctive concept for
which the subject pointed to all the exemplars.
The experimenter then explained the selection pro-
cedure. The basic set of instructions was contained
in the following example:

Each time we play the game I will give you a
picture of a flower that is like the flowers in my
garden. You will pick other flowers you see here.
Each time you pick a flower, I will tell you
whether or not it grows in my garden. If it does
grow in my garden, you will put it under the
“yes” sign here. If it does not grow in my gar-
den, you will put it under the “no” sign here.
Tell me as soon as you know which kind of
flower grows in my garden.

After each trial the experimenter said, “Yes (or
no) that is (not) like the flowers in my garden.”
All subjects were given one demonstration con- 
cept followed by two conjunctive test concepts. 
The subjects were allowed to select without re- 
placement a maximum of 5 instances on demon- 
stration concepts and 12 instances on test con- 
cepts. After three trials the experimenter prompted 
the verbalization of the subject’s guesses by using 
a standard list of prompts, but offered no other 
evaluative comments. If the subject correctly ver- 
balized the solution, he was asked to continue un- 
til all positive instances had been selected and un- 
til he stated that none remained in the display. 
Each instance was scored as either informative 
or unnecessary. A trial was scored as informative 
if the instance chosen by the subject yielded neces- 
sary and new information; as noninformative, 
when the subject’s instance selection conveyed no 
information regarding the concept. Presumably 
such errors reflect an absence of hypothesis-testing 
behavior and hence provide a measure of the ex- 
tent to which the subject can evaluate information 
regarding the concept. An equivalence error was 
scored if the subject’s selection exactly reproduced 
the information provided on a previous trial. It 
was assumed that this type of error was due to 
storage failure and that it constituted an attempt 
to regain the same information that was lost on 
a previous trial. A redundant error was scored when 
a selection yielded information provided by a pre- 
vious instance but did not exactly reproduce the 
same information. Presumably this type of error 
also results from memory failure; however, it is 
assumed that the information loss is partial. The 
number of unnecessary trials was computed as the 
sum of the noninformative, equivalence, and re- 
dundant errors. 


Training Procedure 


Previous research on the acquisition of a focus- 
ing strategy by children has shown that a small- 
step, programmed, part-task method is superior to 
a whole-task procedure in which subjects attempt 
terminal performance early in training (Anderson, 
1968). Consequently, the training procedure was 
divided into segments, each of which contained a 
specific component skill in the terminal task. The 
general progression was cumulative in that the 
subjects acquired and practiced the lowest skill in 
the hierarchy, after which they acquired the next 
skill in the hierarchy and practiced it with all pre-
vious skills.

The training stimuli were two three-dimensional 
problems which were reproduced in color on 11 ×
13 inch white paper. The first problem was Flow-
ers; the second, Shapes. The Shapes problem con-
sisted of 27 shapes which varied according to form, 
letter, and number. The shapes were squares, cir- 
cles, and triangles which contained the letters A, 
B, or C and the numbers 4, 5, or 6 as context ele- 
ments. The problems were presented on an easel at
the front of the room, and each subject marked
his page with colored pencils. A daily record was 
kept of each subject’s performance. Also, subjects 
were exposed to the same problems with the stim-
uli represented on 3½-inch squares that were ar-
ranged on a magnet board beside the words yes
and no. Thus, the experimenter could simulate 
the actual problem by moving the stimuli on the 
magnet board while the subjects marked their 
workbook pages. Finally the formal rules for each 
strategy were shown on a large cardboard sheet at 
the front of the room. These devices were included 
to provide a mechanism for feedback and to reduce 
memory load during instruction. 

The entire training program was divided into 
two major parts, task familiarization and strategy 
instruction. All subjects received the same task 
familiarization; however, two different programs 
were devised for the focusing and scanning strate- 
gies. The sequence of training and component skills 
was the same for both strategies; however, the two 
programs differed in the formal rules and instance 
selection procedures that were taught. 

In the task familiarization phase of training, the 
subjects were taught to delineate the dimensions 
and values involved in each problem, to mark all 
the exemplars of several simple and conjunctive 
concepts, to name all the simple and conjunctive 
concepts in a single focus instance, and to record 
all the possible concepts in a focus instance at the 
bottom of their workbook pages. 

The children were then exposed to the rules for 
their respective strategies and were told that they 
would learn each rule in turn. They were first asked 
to generate all the possible concepts present in a 
particular focus instance. The subjects who were 
taught focusing were instructed in a hypothesis 
formulation procedure in which they were to as- 
sume one value of the focus instance as a tentative 
hypothesis on each trial. Scanners were taught first 
to assume simple concepts and then conjunctive 
concepts as hypotheses. The hypotheses were writ- 
ten on the board in symbolic form. Focusers were 
then taught to select instances which varied one 
value in turn while holding other values constant. 
Scanners were taught to select instances which ex- 
emplified their hypotheses and no others. Focusers 
were taught that if the changed value yielded a 
yes, then the value was irrelevant, and they were 
allowed to mark out all concepts implied by the 
value. If the changed value yielded a no, then the 
focusers learned that the value was relevant, and 
they were allowed to mark out all concepts not 
implied by that value. Scanners were taught that 
if the instance selected yielded a yes, then their 
hypothesis was the concept. If not, then their hy- 
pothesis was incorrect, and they learned to mark 
it out at the bottom of their workbook page. 
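
The focusing rule taught above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stimulus values come from the Shapes problem, the focus instance is taken to be a positive exemplar, the experimenter's yes/no feedback is replaced by a function, and the names `is_exemplar` and `conservative_focusing` are hypothetical.

```python
# Dimensions and values of the Shapes problem (3 x 3 x 3 = 27 stimuli).
DIMENSIONS = {
    "form": ("square", "circle", "triangle"),
    "letter": ("A", "B", "C"),
    "number": (4, 5, 6),
}

def is_exemplar(instance, concept):
    """Stand-in for the experimenter's yes/no feedback."""
    return all(instance[dim] == value for dim, value in concept.items())

def conservative_focusing(focus, concept):
    """Vary one value of the focus instance at a time.

    A "yes" after the change means the changed value is irrelevant;
    a "no" means the value is relevant and belongs to the concept.
    """
    relevant = {}
    for dim, values in DIMENSIONS.items():
        # Change one value while holding the others constant.
        changed = next(v for v in values if v != focus[dim])
        probe = dict(focus, **{dim: changed})
        if not is_exemplar(probe, concept):
            relevant[dim] = focus[dim]
    return relevant

focus = {"form": "square", "letter": "A", "number": 4}
target = {"form": "square", "number": 4}  # a conjunctive concept
found = conservative_focusing(focus, target)
```

In this sketch one selection per dimension suffices, so a three-dimensional problem is resolved in three trials regardless of whether the concept is simple or conjunctive.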

This basic procedure was repeated many times 
with both simple and conjunctive concepts. At the 
end of each trial the subjects were taught to re- 
call all relevant remaining hypotheses and to re- 
peat the procedure if necessary. The process was 
continued until all the possible concepts except one 
had been eliminated. The subjects then named the 
concept. After they had reached Sigel edna


Experimental Procedure 


All subjects were pretested with the Flowers 
problem approximately 1 week before training. 
Each subject was given one demonstration task 
and two conjunctive test concepts according to the 
procedure described above. Two classes of children, 
one from each phase level, were given training in 
conservative focusing, and two similar classes were 
given training in successive scanning. Matching 
control subjects were obtained from one Phase II 
and two Phase III classes. The control subjects did 
not receive special instruction and participated in 
their usual classroom activities during the training 
interval. 

Training was carried out by two experimenters 
in 30-45 minute sessions each day. The program se- 
quence required approximately 20 days with the 
Phase II groups and 15 days with the Phase III
groups. The training was given in the subject’s
classroom on a group basis. The class size ranged 
from 15 to 18 children. One experimenter led the 
exercise while the second provided individual sup- 
port and gave corrective feedback. Feedback was 
given on both a group and individual basis and 
was usually immediate. Positive reinforcement was 
given verbally and applied liberally. Frequent 
prompting of responses was employed as a device
to facilitate responding, to maintain rapport, and 
to encourage attempts with difficult tasks. 

Immediately following training, all subjects were
given Flowers, Blocks, and Animals in counter- 
balanced order. Each subject received one demon- 
stration concept and two conjunctive test concepts 
from each problem. 


RESULTS 


A preliminary analysis of sex differences 
for pretest and posttest unnecessary trials, 
type of error, and number of solutions 
failed to show any significant differences 
between males and females as assessed by t
tests.


Initial Performance 


The means and standard deviations for 
pretest unnecessary trials are given in 
Table 2. The analysis of variance on pretest 
unnecessary trials yielded a significant MA 
effect (F = 6.513, df = 1/82, p < .025);
however, neither the treatment groups effect, 
(F = .903, df = 2/82, p > .05) nor the 
TABLE 2


MEANS AND STANDARD DEVIATIONS FOR EACH
PRETEST VARIABLE FOR EACH GROUP


Variable
Unnecessary trials
X̄ -13| 15.69
SD 99| 4.46
Redundant trials
X̄ 5.00| 4.00
SD 1.00| 1.63
Equivalence trials
X̄ 11.00| 9.38
SD 1.12| 3.28
Noninformative trials
X̄ 1.68| 2.31
SD 1.18| 2.32


Note.—Abbreviations are the same as in Table 1. 


interaction (F = .926, df = 2/82, p > .05)
approached significance. Therefore,
although the initial problem-solving efficiency
of the high-MA groups was superior to that
of the low-MA groups, the treatment groups
within each MA level were matched on
initial efficiency.

As Table 2 shows, the most frequent error
in all groups was of the equivalence type.
The mean number of errors was consistently
higher for the low-MA groups as compared
to the high-MA groups; however,
Table 2 fails to show marked differences
between the treatment groups within each
MA level. The multivariate analysis of
variance (Clyde, Cramer, & Sherin, 1966)
performed on pretest redundant, equivalence,
and noninformative errors is presented in
Table 3.

TABLE 3

SUMMARY OF THE MULTIVARIATE ANALYSIS OF
VARIANCE ON PRETEST REDUNDANT, EQUIVALENCE,
AND NONINFORMATIVE ERRORS


MA R | .34| . «80 
E | 54.44) 6.95 | . 02 
NI | 5.38) 1.84 | .. E 
Treatment | R 7.43) 3.62 | . 54] 1.02) 
E pun d p —.10 
E E E ,B4| —.14 
Interaction | R ,88| 43 | E 
E 7.81| .99 4 
NI | 1.24) .43|. E 


Note.—Abbreviations: R = redundant; E = equivalence;
NI = noninformative.

The discriminant function for the MA effect
for redundant, equivalence, and noninformative
errors was found to be significant
(F = 2.891, df = 3/82, p < .04) by the 
Wilks Lambda Criterion (Cooley & Lohnes, 
1962). The standardized discriminant func- 
tion coefficients of each type of error, to- 
gether with the significant mean differences 
between the high- and low-MA groups with 
respect to equivalence errors, supported the 
conclusion that the overall differences in 
problem-solving efficiency between the two 
MA groups could be attributed, in part, to 
differences between the high- and low-MA 
levels in the frequency with which each 
committed equivalence errors. 

The treatment groups effect yielded two 
discriminant functions. The Wilks Lambda 
test of roots 1 through 2 was significant at 
the .007 level (F = 3.063, df = 6/164), and 
the second root was significant at the .03 
level (F = 3.652, df = 2/82). The univariate
tests of the treatment groups effect showed
significant mean differences between the 
treatment designations for redundant and 
noninformative errors. The treatment effect 
for equivalence errors was not significant. 
Therefore, although the treatment groups 
within each MA level were adequately 
matched on total unnecessary trials (see 
Table 2), complete matching with respect 
to type of error was not achieved. The multivariate
analysis for the MA × Treatment
Groups interaction also removed two dis- 
criminant functions, neither of which was 
significant at the .05 level. 


Unnecessary Trials 


The means and standard deviations for 
posttest unnecessary trials on each problem 
for each group are shown in Table 4. 

The main effect for MA was significant 
at the .0005 level (F = 21.19, df = 1/82), 
and the treatment main effect, at the .01 
level (F = 5.26, df = 2/82). Similarly, the 
MA × Treatment interaction was significant
at the .025 level (F = 4.38, df = 2/82).
The within-subjects tasks effect failed
to approach significance. Also, none of the
Tasks × Groups interactions was significant.

In order to test further the MA × Treatments
interaction, individual comparisons
TABLE 4

MEANS AND STANDARD DEVIATIONS FOR POSTTEST
UNNECESSARY TRIALS


                  Low MA                    High MA
Task          F       S       C         F       S       C
Flowers
  X̄        14.60   12.61   16.73      8.53   10.80   14.27
  SD        6.10    6.45    4.11      4.72    0.12    5.36
Blocks
  X̄        15.73   12.15   13.93      7.40    9.73   11.80
  SD        4.45    5.80    6.02      4.52    7.00    5.89
Animals
  X̄        16.20   12.00   15.80      5.73    8.53   13.87
  SD        3.93    6.15    5.04      2.57    5.29    4.91
Total
  X̄        46.53   36.61   46.47     21.68   29.07   39.93
  SD       11.38   16.85   12.75      8.73   15.84   14.34


Note.—Abbreviations are the same as in Table 1.


were made between each experimental 
group within each MA level by the New- 
man-Keuls procedure (Winer, 1962, p. 80). 
These tests supported the conclusion that, 
with the exception of the LF group, subjects 
who received instruction performed more 
efficiently on retention and transfer prob- 
lems than did subjects who had not received 
training. Also, subjects with mental ages of 
7-8 years performed more efficiently follow- 
ing strategy training than did those with 
mental ages of 5-6 years. The significant 
MA × Treatments interaction was explained
by the finding that subjects with
mental ages of from 5 to 6 years who re- 
ceived scanning instruction were more 
efficient on retention and transfer tasks 
than were subjects of the same mental age 
who received instruction in the focusing 
strategy. On the other hand, subjects with 
mental ages of from 7 to 8 years who re- 
ceived instruction in focusing were more ef- 
ficient following training than were those of 
comparable mental ages who received scan- 
ning instruction. The data offered no evi- 
dence of differential tasks effects in overall 
efficiency. 


Number of Solutions 

The means and standard deviations for 
the number of solutions on each posttest 
problem for each group are presented in 
TABLE 5


MEANS AND STANDARD DEVIATIONS FOR THE
NUMBER OF SOLUTIONS TO POSTTEST PROBLEMS


                  Low MA                    High MA
Task          F       S       C         F       S       C
Flowers
  X̄          .67    1.08     .40      1.60    1.40     .73
  SD         .89     .96     .74       .63     .74     .79
Blocks
  X̄          .60    1.23    1.00      1.73    1.53    1.13
  SD         .74     .77     .98       .46     .74     .91
Animals
  X̄          .60    1.00     .73      2.00    1.60    1.00
  SD         .63     .84     .88       .00     .74     .93
Total
  X̄         1.20    3.31    2.13      5.33    4.53    2.87
  SD        1.92    2.84    2.17      1.05    1.88    2.29


Note.—Abbreviations are the same as in Table 1.


Table 5. Since several violations of homoge- 
neity of variance were noted among the 
separate tasks, the number of solutions for 
each problem was pooled to form a total 
score in an attempt to achieve greater sta- 
bility. Nevertheless, a significant F max (F 
= 4.68, df = 6/14, p < .05) was still found 
between the HF and LS groups. Since this 
difference was impervious to transforma- 
tion, a conservative analysis of variance 
was carried out by using the pooled with- 
in-cell variance without the HF group. Ac- 
cordingly, 15 degrees of freedom were sub- 
tracted from the error term. The MA main 
effect was significant at the .0005 level (F 
= 15.75, df = 1/68), and the treatment 
effect, at the .05 level (F = 3.56, df = 
2/68). Similarly, the MA × Treatment interaction
was significant at the .05 level (F
= 8.40, df = 2/68).

Individual comparisons showed that the
instruction groups with mental ages of 5-6
years failed to achieve significantly more
solutions to retention and transfer problems
than the control condition with subjects of
the same mental ages. Nevertheless, the
data did support the hypothesis that scanning
instruction was relatively more facilitative
of concept attainment than was instruction
in focusing at the lower MA level.
Instruction in either strategy at the 7-8
year MA level yielded significantly more
solutions to retention and transfer problems
than did no instruction. Although the HF
group was not significantly superior to the
HS group, the means for these groups favored
such a trend and undoubtedly contributed
to the significant MA × Treatment
interaction.


Type of Error 


The means, standard deviations, and per- 
centage of unnecessary trials for redundant 
(R), equivalence (E), and noninformative 
(NI) errors are shown in Table 6. Since the 
assumption of homogeneity of variance 
could not be met in some cases for each 
task taken separately, the error scores on 
each problem were pooled in order to 
achieve greater stability. Table 7 provides a 
summary of the multivariate analysis of 
variance on the three types of errors. The 
test of the MA effect by Wilks Lambda 
Criterion was significant at the .001 level 
(F = 7.154, df = 3/82). Table 7 shows that 
each univariate test of the MA main effect 
was highly significant for each type of 
error. Similarly, each variable showed a 
high degree of correlation with the compos- 
ite scores. Nevertheless, inspection of the 
discriminant function coefficients indicated 
that although all variables seemed to con- 
tribute to the MA effect, redundant errors 
contributed relatively more variance to the 
between-levels effect than did the equiva- 
lence or noninformative errors. 

The overall test of the treatment main 


TABLE 6

MEANS, STANDARD DEVIATIONS, AND PERCENTAGE
OF UNNECESSARY TRIALS (UT) FOR EACH TYPE
OF ERROR ON POSTTEST PROBLEMS


                  Low MA                    High MA
Error         F       S       C         F       S       C
Re 18.80 0.47 

1 9.85 | 12.80 | 6. 9.13 | 10. 
SD 3.41 | 3:68 | 3:10 | $08 | 4.88 | 3.08 
g UT | 29.05 | 26.80 | 27.54 | 32:00 | 31.42 | 26.21 
x 28.13 | 22.66 | 26.93 | 13.60 | 17.80 | 24.93 
SD 12.80 | 10.80 | 6.90 | 5.78 | 9.72 | 9.35 
UT | 66.04 | 64.28 | 57.96 | 62.76 | 62.38 | 62.43 
4.60 | 4.15 | 6.73 4 " 4.58 
SD 3.60 | 4.51 | 4.85 Lu $15 | £o 
% UT 9.88 | 11.34 | 14.49 | 5.23 | 7.33 | 11.35 


Note.—Abbreviations are the same as in Tables 1 and 3. 
TABLE 7

MULTIVARIATE ANALYSIS OF VARIANCE ON POSTTEST
REDUNDANT, EQUIVALENCE, AND
NONINFORMATIVE ERRORS


                                               Discriminant   Correlation
                                               function       with
Source          Error   MS        F      p     coefficient    composite
MA (A)          R       246.     18.07   .001     .514          .906
                E       1152.04  17.67   .001     .385          .897
                NI      144.     10.28   .002     .277          .684
Treatment (B)   R       34.       2.53   .086    −.128          .509
                E       289.23    4.44   .015     .508          .820
                NI      71.       5.10   .008     .648          .896
A × B           R       75.91     5.56   .005     .672          .926
                E       320.87    4.92   .010     .520          .857
                NI      4.         .85   .705    −.307          .287


Note.—Abbreviations are the same as in Table 3.


effect by Wilks Lambda Criterion was sig- 
nificant at the .04 level (F = 2.235, df = 
6/164). A second discriminant function was 
removed, but it failed to approach signifi- 
cance (F = .693, df = 2/92, p > .05). Ac- 
cording to Table 7, the univariate analysis 
for equivalence and noninformative errors 
showed significant treatment effects, while 
that for redundant errors only approached 
significance. The discriminant function co- 
efficients and correlations for the treat- 
ment main effect indicated that both equiv- 
alence and noninformative errors contrib- 
uted to the between-treatments sum of 
squares; however, the contribution of re- 
dundant errors appeared to be negligible. 

Similarly, 2 roots were removed in the 
Wilks Lambda test of the MA × Treatment
interaction. The first was significant
at the .01 level (F = 2.682, df = 6/164); 
the second was not significant (F = 2.067, 
df = 2/82, p > .05). The univariate test of 
the interaction showed significant interaction 
components for redundant and equivalence 
errors, but was not significant with respect 
to noninformative errors. Likewise, the dis- 
criminant function coefficients and correla- 
tions for the interaction showed a high con- 
tribution for redundant and equivalence er- 
rors and a low negative relationship for 
noninformative errors. 


DISCUSSION


Unfortunately, these conclusions must be 
interpreted in the light of several limitations.
First, as was noted earlier, the subject
selection procedure resulted in considerable 
overlap in chronological age between the 
experimental groups. Second, since the 
high-MA groups initially were superior to 
the low-MA groups, it might be argued that 
performance on posttest problems merely 
reflected the initial level of performance. 
Therefore, a series of covariance procedures 
was carried out on each dependent variable 
with CA, and pretest solutions and unneces- 
sary trials as covariates. The regression re- 
moved by CA in every case proved to be 
negligible and it was concluded the results
reported above could not be attributed to
uncontrolled variance in CA. Although the
covariance procedure removed a significant
amount of regression in each analysis when
initial performance was controlled, both
main effects and the interaction remained
significant. Similarly, the relationship between
the adjusted cell means was the same
as that for the unadjusted means.

Although significant treatment main ef- 
fects were found for both total solutions 
and total unnecessary trials, training did 
not uniformly facilitate concept attainment 
and problem-solving efficiency for all exper- 
imental groups. Since the subjects were 
young educable retarded children, this find- 
ing may reflect the generally low level of 
verbal skills frequently found in such chil- 
dren. Although the daily performance of the 
5-6 year MA focusing and scanning groups 
was not markedly different early in train- 
ing, the low-focusing group showed consid- 
erable difficulty in learning the decision 
rule, that is, the interpretation of positive
feedback as indicative of an irrelevant 
value. Thus, although the subjects at the 
5-6 year level who were taught focusing 
learned to generate hypotheses and to select 
appropriate instances, they may have been 
unable to interpret the information that 
was conveyed on each trial. On the other 
hand, these results provide support for the 
assertion that retarded children with MAs 
of 7-8 years can be taught effective prob- 
lem-solving strategies for the processing of 
information in a discovery learning task, 
and that the acquisition of such strategies 
has a facilitative effect on their subsequent 
concept attainment and problem-solving
efficiency.
Although a priori predictions were not 
made regarding the differential effects of 
strategy instruction on the specific types of 
errors, inspection of the discriminant func- 
tion coefficients and correlations for each 
variable indicated that redundant and 
equivalence errors contributed relatively 
greater variance to the MA effect than did 
noninformative errors. These data suggest 
that memory load played a greater role in 
the differential performance of the two MA 
groups than did errors in instance selection. 
The multivariate analysis of the treatment 
main effect indicated a relatively greater 
contribution for equivalence and noninfor- 
mative errors than for redundant errors. 
Thus, instruction in either strategy reduced 
the number of equivalence and noninforma- 
tive errors. However, instruction in scan- 
ning was relatively more effective than was 
instruction in focusing in reducing equiva- 
lence and redundant errors at the 5-6 year 
MA level, while focusing was more effective 
than was scanning at the 7-8 year level. 
Consequently, relatively high discrimi- 
nant-function coefficients and correlations 
were found for redundant and equivalence 
errors with respect to the interaction. 
Therefore, it may be concluded that signifi- 
cant increases in problem-solving efficiency 
following instruction may be attributed to 
an increase in informative hypothesis-test- 
ing behavior for both strategy groups, and 
that the relative superiority of either strat- 
egy was determined by the extent to which 
it reduced memory load in the task. 

In summary, little support can be offered 
for the description of cognitive development 
advanced by Piaget (Inhelder & Piaget, 
1958) which contends that formal reasoning
begins at year 12 and reaches equilibrium 
at 14-16 years. Rather, the present findings 
suggest that children below the level of for- 
mal operations can acquire and transfer 
rather complex cognitive operations when 
given suitable training. It should be noted 
that studies which have supported the posi- 
tion that formal operations are necessary 
for strategy utilization have not generally
employed extensive training in such operations
(Tagatz, 1967; Yudin & Kates, 1963).
Similarly, these results are inconsistent
with those of several previous studies which 
have shown that young children can be 
taught specific operations such as conserva- 
tion before they are developmentally 
“ready,” but that they are unable to trans- 
fer operations to similar problems (Smed- 
slund, 1961; Wohlwill & Lowe, 1962). The 
failure to find significant task effects or 
Tasks × Groups interactions showed that
strategy instruction facilitated performance 
on problems which involved new instances 
and concepts. These findings tend to con- 
firm those obtained by Anderson (1965) in 
his analysis of transfer following instruc- 
tion in conservative focusing. 

At the same time, the results do not seem 
to support the position that conceptual de- 
velopment proceeds by the learning of more 
complex response capabilities independently 
of the formal properties of the informa- 
tion-processing sequence (Anderson, 1965; 
Gagné, 1968). The youngest children in this 
study were unable to master an indirect 
system of evaluating feedback and, accord- 
ingly, were unable to process information 
over trials. However, children at the same 
developmental level were able to acquire a 
simpler strategy based on the direct test of 
hypotheses but were relatively handicapped 
in retaining information over trials. At the 
7-8 year MA level, both strategies were ac- 
quired and effectively utilized; however, 
relative efficiency was determined by the 
extent to which each reduced memory load. 
Therefore, cognitive complexity seemed to 
play a greater role in understanding per- 
formance at the 5-6 year MA level and 
task complexity a greater role at the 7-8 
year level. The interactive posture of these 
two dimensions in the present study sug- 
gests that future research in conceptual de- 
velopment might attempt to manipulate 
“process” variables at various ages by in- 
struction and then to observe their effect on 
the acquisition and utilization of concepts 
under different conditions of informational 
demand and memory load. 
INFORMATION DELAY AND RETENTION:
EFFECT OF INFORMATION IN FEEDBACK AND TESTS¹


PERSIS T. STURGES²
Chico State College


Two experiments investigated the effect of delay of informative feed- 
back, immediate tests, and forms of informative feedback upon 7-day 
retention. In Experiment I, four forms of informative feedback dif- 
fered in (a) number of alternatives included, and (b) presence or ab- 
sence of redundant position cues. In Experiment II, informative feed- 
back either (a) identified the correct alternative, with definitions of 
incorrect alternatives; or (b) presented a cue to the correct alterna- 
tive. Superior retention with delayed feedback occurred (a) following 
an immediate recognition test with variable but not redundant feed- 
back, (b) following an immediate recall test with all forms of feedback, 
and (c) following no immediate test with feedback a cue. Following 
an immediate test, with feedback a cue, retention with immediate
feedback improved and delayed feedback was no longer superior.


Recent experiments have investigated the 
effect of 24-hour delay of informative feed- 
back upon acquisition and/or retention of 
verbal learning tasks. The subjects had been 
presented a series of items and made an ini- 
tial response to each; and then 24 hours later 
they had been presented the series of informative
feedback. This 24-hour delay condition
has been compared either with a condition in 
which feedback is presented immediately, 
item by item, (Sturges, 1969) or with one in 
which the series of feedback is presented im- 
mediately after the series of items (Sassen- 
rath & Yonge, 1968). The general results of 
these investigations are that at acquisition 
or immediate retention there is no signifi- 
cant difference between delayed and immedi- 
ate feedback but on later retention, delayed 
feedback is superior. 

Although the effects of a 24-hour delay of 
feedback have some similarity to those with 
a shorter delay interval, these investigations 
differ in several important ways and there is
reason to question that they can be com- 
pared directly. However, the effects of a 
24-hour delay interval are important be- 
cause (a) the operations are those of a delay 
of informative feedback and thus the results 
offer information on the generality of find- 
ings with shorter delay intervals, and (b) 
the effect of the longer delay interval on re- 
tention is important in understanding fac- 
tors involved in long-term retention. 

An important question in investigating 
factors that lead to improved retention with 
24-hour delays of feedback is: What are 
subjects actually learning with immediate 
and delayed feedback? One possibility is 
that this effect is due to events occurring 
during the delay interval and that in both 
conditions subjects are learning primarily 
or solely the correct stimulus-response as- 
sociation. Thus, with delayed feedback, the 
correct members would be strengthened dur- 
ing acquisition, resulting in improved reten- 
tion. However, this should result in superior 
immediate retention, which has not been 
found. On the other hand, it may be that 
during the delay interval a process occurs 
that is similar to the Zeigarnik effect of an 
incomplete task making the correct stimu- 
lus-response association a more salient as- 
pect and thus resulting in better retention. 

A second explanation is that the difference 
in retention with immediate and 24-hour delay
of feedback is due to factors operating at
or following the presentation of informative 
feedback and that what subjects learn with 
immediate and delayed feedback actually 
differs. This hypothesis suggests that sub- 
jects respond differently to informative feed- 
back when it is presented immediately after 
the response than they do when it is pre- 
sented after a delay interval; and that the 
way they respond to the feedback deter- 
mines what they learn and, therefore, their 
retention of the material. With this hypothe- 
sis one possibility is that the additional 
learning postulated to occur with delayed 
feedback results in more precise discrimina- 
tion of the correct choice due to the learning 
of both the incorrect and the correct alterna- 
tives. This interpretation would be similar 
to concept identification in which subjects 
learn to identify the negative as well as the 
positive instances of the concept. A second 
possibility with this hypothesis is that the
additional learning postulated with delayed 
feedback results in higher order organiza- 
tion of the items in a way similar to that 
found with free recall of individual words 
(Mandler, 1967). It may be that in tasks in 
which a subject must learn to discriminate 
the correct from the incorrect alternatives, 
retention is improved as he identifies rela- 
tionships among the stem, the correct, and 
the incorrect alternatives. According to this 
interpretation, then, the effect of delayed 
feedback would depend upon (a) stimulus 
aspects present during feedback and (b) 
the relevance of these stimuli to the reten- 
tion test. 

Some support for this second interpreta- 
tion was found in an earlier experiment 
(Sturges, 1969). Superior retention with 24- 
hour delay occurred when feedback included 
the incorrect in addition to the correct al- 
ternative but not when it included the cor- 
rect alternative only. Thus, these findings 
support the hypothesis that the effect of 24- 
hour delay of feedback is due to factors 
operating at or following the presentation of 
informative feedback rather than to events 
occurring during the delay interval. How- 
ever, the interpretation of these findings is 
not clear. Removal of the incorrect alternatives
did remove the effect of delayed feedback
upon retention. However, the presence
of the incorrect alternatives was confounded 
with the presence of other cues, that is, the 
position and the letters of the alternatives, 
all of which were also present on the reten- 
tion tests. Therefore, it is not clear whether 
these results were due to the redundancy of 
feedback cues, to the utilization of more 
cues in general, and/or to the knowledge of 
the specific alternatives. 

One purpose of the present experiments 
was to separate these factors. A second pur- 
pose was to identify some additional varia- 
bles involved in the effect of 24-hour delay 
of feedback upon retention. Two experi- 
ments investigated the following questions 
about the effect of 24-hour delay of feed- 
back upon retention. Does this effect differ 
with the form of informative feedback and, 
if so, what cues are most likely to be utilized 
during feedback to facilitate retention? 
Does it depend upon the form of retention 
tests, that is, recall or recognition? Does it 
differ with the presence and form of an im- 
mediate retention test? 

Three delay intervals were compared: 0- 
minute delay, in which feedback was pre- 
sented immediately item by item; 20-minute 
delay, in which the series of informative 
feedback was presented immediately after 
the series of items; and 24-hour delay, in 
which subjects received the series of items 
on the first day and returned 24 hours later 
for the series of informative feedback. These 
three delay intervals permit evaluation of 
the effect of the sequence of experimental 
events (0-minutes) and experimental activ- 
ity during the delay interval (20-minutes) 
in addition to the length of delay. 

Retention was measured by both a recall 
and a recognition test to provide additional 
information about what subjects are learn- 
ing with different delay intervals and forms 
of informative feedback. If superior reten- 
tion with delay is found only with a recogni- 
tion test, it may be due to minimal learning, 
to discrimination among alternatives on a
recognition level, etc. However, if this effect 
is a result of higher order organization of 
the material, then it should occur with a 
recall test as well as with a recognition test, 
that is, subjects should then be able to recall
it with minimal cues at retention.

In addition, subjects received either no
immediate test, an immediate recall test, or 
an immediate recognition test. This provided 
a measure of immediate retention or acquisi- 
tion and also an evaluation of the effects of 
immediate tests upon the 7-day retention 
tests. In both experiments all forms of infor-
mative feedback provided opportunity for 
the association of the stem and the correct 
alternative. The concern was with the effect 
of additional information presented at feed- 
back. 


EXPERIMENT I


Experiment I investigated the effect of 
four forms of informative feedback which 
differed in two ways: (a) the number of al- 
ternatives included; and (b) the presence or 
absence of redundant cues, position, and the 
letters of the alternatives. The redundant
stimuli were not included on the tests and
thus were not directly relevant.


Method


Design

Three variables were combined factorially: four
forms of informative feedback [Right-Wrong
Redundant (RW+), Right Redundant (R+), Right-Wrong
Variable (RW), and Right Variable (R)]; three delay
intervals (0 minute, 20 minute, and 24 hour); three
immediate test conditions [nothing, recall (Re-I),
and recognition (Reg-I)]. All subjects had two 7-day
retention tests: recall and recognition. The subjects
were 468 undergraduates, fulfilling a course
requirement, randomly assigned with 13 subjects in
each of the 36 groups.


Apparatus and Learning Material


Learning material was a series of 32 multiple-choice
items with a definition as a stem and four uncommon
English words as alternatives. The task was to learn
the correct word for each definition. The items were
selected with eight from each of four word
categories: concrete nouns, abstract nouns,
adjectives, and verbs. The following display shows
a sample item and the four forms of informative
feedback.


Initial Presentation:

“TO CLEAR FROM BLAME”
a. EXCULPATE
b. LUCUBRATE
c. LIBRATE
d. PROPITIATE


Informative Feedback:

RW+ Right + Wrong-Redundant

“TO CLEAR FROM BLAME”
*a. EXCULPATE
b. LUCUBRATE
c. LIBRATE
d. PROPITIATE

R+ Right only-Redundant

“TO CLEAR FROM BLAME”
*a. EXCULPATE

RW Right + Wrong-Variable

“TO CLEAR FROM BLAME”
PROPITIATE
LIBRATE
*EXCULPATE
LUCUBRATE

R Right only-Variable

“TO CLEAR FROM BLAME”
*EXCULPATE


For the initial presentation, each item consisted of
the stem with each alternative below it and preceded
by a letter (a, b, c, d). Each form of informative
feedback included the stem and the correct
alternative, which was underlined and had an asterisk
to its left. Two forms of informative feedback had the
stem and the correct alternative only (R, R+), and two
had the stem with the correct and all incorrect
alternatives (RW, RW+). For each of these, one was
redundant with the letters and each alternative in the
same position as in the initial presentation. For the
two variable forms of informative feedback, each
alternative was in a randomly different position and
without the letters.

The items were presented in the same random 
order for the initial and informative feedback 
presentations. On both immediate and 7-day tests 
the items were in different random orders, and on 
the recognition tests the alternatives were in ran-
domly different positions with no letters preceding
them. The recognition test presented the stem
and all four alternatives; the recall test presented
the stem only; and in both tests, subjects wrote
the correct alternative.

All material was presented on 35-millimeter 
slides by Kodak Carousel slide projector with 
presentation intervals automatically controlled by 
electronic timing units. For the initial presentation 
and the tests, subjects recorded their answers on 
special devices designed so that the answer was 
turned out of view immediately. 


Procedure


Subjects in all groups participated in the following
three phases of the experiment: initial presentation
of the material with subjects answering each item;
presentation of informative feedback; and both 7-day
retention tests. The following display shows the
temporal sequence of events for the three delay
conditions.


0-min. Delay:

Session 1: Item 1 (15 sec.), Write R (15 sec.), IF, Item 1 (15 sec.), Rest (10 sec.), Item 2 (15 sec.) ... Item 32 (15 sec.), Write R (15 sec.), IF, Item 32 (15 sec.), Rest (10 sec.) // (Immediate Test)


20-min. Delay:

Session 1: Item 1 (15 sec.), Write R (15 sec.), Item 2 (15 sec.) ... Item 32 (15 sec.), Write R (15 sec.) / IF, Item 1 (15 sec.), Rest (10 sec.), IF, Item 2 (15 sec.) ... / (Immediate Test)


24-hr. Delay:

Session 1: Item 1 (15 sec.), Write R (15 sec.), Item 2 (15 sec.) ... Item 32 (15 sec.), Write R (15 sec.)

Session 2 (24-hrs. later): IF, Item 1 (15 sec.), Rest (10 sec.), IF, Item 2 (15 sec.) ... / (Immediate Test)


All Three Delay Conditions, Immediate and 7-day Recall and Recognition Tests:

Item 1 (15 sec.), Write R (15 sec.), Item 2 (15 sec.) ...


For subjects with 0-minute and 20-minute delay of
informative feedback, the initial presentation of the
material and the presentation of informative feedback
occurred in the same session. For the 0-minute groups,
informative feedback was presented immediately item by
item. For the 20-minute delay groups the series of
items was followed immediately by the series of
informative feedback. For the 24-hour delay groups the
sequence of events was the same as for the 20-minute
delay groups except that subjects received the series
of items on the first day and returned 24 hours later
for the series of informative feedback.

In all three delay conditions, subjects had the 
same number, type, and length of presentation of 
the initial material and of informative feedback. 
For subjects in the immediate test groups the re- 
tention test (Re-I; Reg-I) was given immediately 
after the series of informative feedback: on Session 
1 for the 0- and 20-minute delay groups; and on 
Session 2 for the 24-hour delay groups. All subjects 
were given a recall test followed by a recognition 
test 7 days after the presentation of informative 
feedback. 
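
The factorial design of Experiment I can be checked with a short sketch. The Python below is only an illustration, not part of the original method: the variable and function names are hypothetical, while the factor levels and the 13 subjects per group are taken from the Design section. It enumerates every combination of feedback form, delay interval, and immediate test condition and confirms the reported group and subject counts.

```python
from itertools import product

# Factor levels as named in the Design section of Experiment I.
FEEDBACK_FORMS = ["RW+", "R+", "RW", "R"]
DELAYS = ["0-minute", "20-minute", "24-hour"]
IMMEDIATE_TESTS = ["none", "recall", "recognition"]
SUBJECTS_PER_GROUP = 13


def enumerate_groups(*factors):
    """Return one tuple per experimental group (all level combinations)."""
    return list(product(*factors))


groups = enumerate_groups(FEEDBACK_FORMS, DELAYS, IMMEDIATE_TESTS)
total_subjects = len(groups) * SUBJECTS_PER_GROUP

print(len(groups))      # 36 groups
print(total_subjects)   # 468 subjects
```

Every group also received both 7-day tests, so form of retention test is a within-subjects factor and does not multiply the number of groups.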


Results 


Immediate Tests 


The mean number of items correct for
each group for each of the immediate tests
was analyzed by analysis of variance. The
effect of delay was divided into two orthog-
onal components: D1, (24 hours + 20 min-
utes) versus 0 minutes; and D2, 24 hours versus
20 minutes. The effect of form of informa-
tive feedback was divided into three orthog-
onal components: F1, right-wrong (RW+,
RW) versus right only (R+, R); F2, varia-
ble (RW, R) versus redundant (RW+,
R+); and the interaction of these two com-
ponents. The third main effect was the form
of test (T).
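
The orthogonal components described above can be written as contrast weights. The sketch below is an illustration only: the specific weights are an assumption (the conventional choice for equal group sizes), since the paper names the comparisons but does not print the coefficients. It checks that each set of weights sums to zero and that the components are mutually orthogonal.

```python
from itertools import combinations

# Assumed weights over delay levels, ordered (0-minute, 20-minute, 24-hour).
D1 = (-2, 1, 1)   # combined 24-hour + 20-minute versus 0-minute
D2 = (0, -1, 1)   # 24-hour versus 20-minute

# Assumed weights over feedback forms, ordered (RW+, R+, RW, R).
F1 = (1, -1, 1, -1)                        # right-wrong versus right only
F2 = (-1, -1, 1, 1)                        # variable versus redundant
F1xF2 = tuple(a * b for a, b in zip(F1, F2))  # interaction component


def is_contrast(weights):
    """A contrast's weights sum to zero."""
    return sum(weights) == 0


def are_orthogonal(a, b):
    """With equal group sizes, contrasts are orthogonal if their dot product is zero."""
    return sum(x * y for x, y in zip(a, b)) == 0


delay_ok = are_orthogonal(D1, D2)
feedback_ok = all(are_orthogonal(a, b) for a, b in combinations([F1, F2, F1xF2], 2))
print(delay_ok, feedback_ok)
```

Two components exhaust the 2 degrees of freedom for delay, and three exhaust the 3 degrees of freedom for form of feedback, which is why each reported component carries 1 degree of freedom.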

The significant effect of delay (F = 14.40,
df = 2/288, p < .001) was accounted for
by one component: the combined 24-hour +
20-minute groups were superior to 0-minute
delay (D1) (F = 27.47, df = 1/288, p <
.001). The interaction between this compo-
nent of delay and form of test (D1 × T)
was significant (F = 7.02, df = 1/288, p <
.01). The effect of delay was more marked
on the recall test (24 hours + 20 minutes,
X̄ = 11.19; 0 minutes, X̄ = 6.62) than on
the recognition test (24 hours + 20 minutes,
X̄ = 28.26; 0 minutes, X̄ = 26.90). This
finding that delay was superior on an im-
mediate test is contrary to those of other
studies (More, 1969; Sassenrath & Yonge,
1968; Sturges, 1969). However, in previous
studies reporting no effect of delay on im-
mediate retention, either a recognition test
has been used or a 24-hour delay was com-
pared with 20-minute delay.
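
The delay by form-of-test interaction reported above can be made concrete with the four means given in the text. The following sketch is illustrative arithmetic only; the dictionary layout and function name are hypothetical. It computes the advantage of the combined delay groups over 0-minute delay on each immediate test and the difference between those advantages.

```python
# Mean items correct (out of 32) on the immediate tests, as reported in the text.
recall = {"delayed": 11.19, "immediate": 6.62}
recognition = {"delayed": 28.26, "immediate": 26.90}


def delay_advantage(means):
    """Mean for the combined 24-hour + 20-minute groups minus the 0-minute mean."""
    return round(means["delayed"] - means["immediate"], 2)


recall_gain = delay_advantage(recall)
recognition_gain = delay_advantage(recognition)
interaction = round(recall_gain - recognition_gain, 2)

print(recall_gain)        # 4.57
print(recognition_gain)   # 1.36
print(interaction)        # 3.21
```

The recall advantage is more than three times the recognition advantage, which is the pattern the significant interaction describes.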

The significant effect of form of test (F = 
1115.68, df = 1/288, p < .001) indicated 
superior performance on the recognition test 
to that on the recall test. Also, there was a 
significant interaction between the forms of 
feedback (F1 × F2) (F = 4.84, df = 1/288,
p < .05). When all alternatives were pre- 
sented at informative feedback, redundant 
was superior to variable but when the cor- 
rect alternative only was presented, variable 
was superior to redundant. 


Seven-Day Retention Tests 


Figures 1 and 2 present the mean correct 
for each of the groups and each of the 7-day 
retention tests. These data were analyzed 
by an analysis of variance with the same 
orthogonal comparisons as for the immedi-
ate tests; and in addition the effect of the
immediate test condition was divided into
two components: T1, the combined tests
(Re-I + Reg-I) versus no test; and T2,
Re-I versus Reg-I. There was also a within-
subjects factor, form of 7-day retention test
(R).

There was a significant effect of delay 
with both components significant. The com- 
bined 24-hour and 20-minute delay was su- 
perior to 0-minute delay (D1) (F = 30.40,
df = 1/432, p < .001) and the 24-hour delay
was superior to the 20-minute delay (D2)
(F = 9.88, df = 1/432, p < .01). As on the
immediate tests, the superiority of the com-
bined delay groups was greater on the recall
test than on the recognition test (D1 × R)
(F = 5.81, df = 1/432, p < .05). Also, over-
all performance on the recognition test was
significantly better than on the recall test 
(R) (F = 9515.00, df = 1/432, p < .001). 
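
The error degrees of freedom in these F ratios follow from the design. The sketch below rests on two assumptions that the text does not state outright: that the error term is subjects within groups, and that only the groups given an immediate recall or recognition test entered the immediate-test analysis. Under those assumptions the function (a hypothetical helper) reproduces the reported values of 432 and 288.

```python
def error_df(levels_per_factor, subjects_per_group):
    """Subjects-within-groups error df: number of groups times (n - 1)."""
    groups = 1
    for levels in levels_per_factor:
        groups *= levels
    return groups * (subjects_per_group - 1)


# 7-day tests: 4 feedback forms x 3 delays x 3 immediate test conditions.
seven_day_df = error_df([4, 3, 3], 13)

# Immediate tests: the no-test groups drop out, leaving 2 test forms.
immediate_df = error_df([4, 3, 2], 13)

print(seven_day_df)   # 432
print(immediate_df)   # 288
```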

There was an overall effect of the immedi-
ate test conditions, with both components
significant. Seven-day retention for the com-
bined immediate test groups was superior to
that with no immediate test (T1) (F =
57.69, df = 1/432, p < .001) and retention
following an immediate recognition test was
superior to that with an immediate recall
test (T2) (F = 12.89, df = 1/432, p < .001).
There was also a significant interaction be-
tween immediate test and form of 7-day
test (T2 × R) (F = 58.41, df = 1/432,
p < .001). The superior retention following
the immediate recognition test was greater
on the 7-day recognition test than on the
7-day recall test.

Fig. 2. Mean correct 7-day recognition tests in Experiment I: delay, feedback, and
immediate test conditions.

There was no overall effect of form of 
feedback, but there were some significant 
interaction effects between delay, form of 
feedback, and the other variables. One com- 
ponent of the interaction of form of feed- 
back and type of retention test (F1 × R)
was significant (F = 6.00, df = 1/432, p <
.05). Performance on the recall test was
best when feedback included all alternatives
and on the recognition test it was best for
feedback with the correct alternative only.

Of primary interest is the finding that one 
component of the interaction of form of 
feedback, delay, and immediate test (F2 ×
D1 × T2) was significant (F = 5.64, df =
1/432, p < .05). The superiority of the com-
bined 24-hour and 20-minute delay groups 
differed with the form of feedback and the 
immediate test conditions. This is most 
readily seen in Figure 2. Following an im- 
mediate recognition test there is a marked 
relationship between delay and variable- 
redundant form of feedback. Superior reten- 
tion with delay occurred when feedback was 
in variable form but not when it was re- 
dundant. Following an immediate recall 
test the slight superiority of delay with re- 
dundant feedback is due almost solely to 
the inferiority of one group, Right-Wrong 
Variable, 20-minute delay. 

One component of the interaction between 
form of feedback, delay, and form of reten-
tion test (D2 × R × [F1 × F2]) was also
significant (F = 4.14, df = 1/432, p < .05).
The superiority of 24-hour delay to 20-min-
ute delay was a function of the type of re-
tention test and the interaction of Right-
Wrong versus Right and Variable versus Redundant
forms of feedback. This effect is accounted
for primarily by the greater superiority of 
24-hour to 20-minute delay on the 7-day 
recall test with Right-Wrong Variable than 
with Right-Wrong Redundant and with 
Right Redundant than with Right Variable 
(see Figure 1). The opposite relationship 
occurred on the 7-day recognition test, al- 
though to a lesser degree. 


EXPERIMENT II


Experiment II investigated the effect of 
delay as a function of two different forms of 
feedback. One form of informative feed-
back, selected to provide more information
on what cues are utilized to facilitate reten- 
tion, presented the definition for each incor- 
rect alternative. This condition contrasted 
with both Right-Wrong Variable and Right-
Wrong Redundant in Experiment I. The ad- 
ditional information in this form of feedback 
was neither the same as in the initial presen- 
tation nor directly relevant to the recogni- 
tion tests. 

The second form of informative feedback
provided a more direct test of the hypothesis
that superior retention with delay occurs
because the subjects respond to all informa-
tion present at feedback after a delay inter-
val but not when feedback is immediate. In
this form of feedback the entire item was
presented, the correct alternative was not
indicated, but a cue was included which the
subject could use to find the correct alterna-
tive. Thus, subjects in both immediate and
delay conditions should be directed to ex-
plore all alternatives, and, if this is the fac-
tor producing superior retention with de-
layed feedback, this effect should disappear.


Method


Design

Three variables were combined factorially: two
forms of feedback (Right-Wrong Definitions
[RW-D]; and Right-Wrong Cue [RW-C]); three
delay intervals (0-minutes, 20-minutes, and
24-hours); and three immediate test conditions
(nothing, recall [Re-I], recognition [Reg-I]). All
subjects had both 7-day retention tests, recall and
recognition. Subjects were 180 undergraduates,
fulfilling a course requirement, randomly assigned
with 10 subjects in each of the 18 groups.


Learning Material and Procedure


All material and procedures were identical to
Experiment I except for the forms of informative
feedback and the length of presentation of feed-
back. The following display shows the two forms
of informative feedback, in both of which the al-
ternatives were in randomly different positions
without letters and thus were also variable. For all
groups, informative feedback was presented for 20
seconds. All other temporal intervals were the
same as in Experiment I.


RW-D Right + Wrong-Definitions

“TO CLEAR FROM BLAME”
LIBRATE (vibrate)
PROPITIATE (pacify)
LUCUBRATE (study laboriously)
*EXCULPATE


RW-C Right + Wrong-Cue

“TO CLEAR FROM BLAME”
LUCUBRATE
EXCULPATE
PROPITIATE
LIBRATE
(EX = OUT; CULP = GUILT,
AS IN CULPRIT)


Results


Immediate Tests


The mean number correct for each group for each
of the immediate tests was analyzed by
analysis of variance with the same orthog-
onal comparisons for the effect of delay as in
Experiment I. The only significant effect
was that performance on the recognition
test was superior to that on the recall test
(F = 405.77, df = 1/108, p < .001).


Seven-Day Retention Tests 


Figures 3 and 4 present the mean correct
responses for each group for each of the 7-
day tests. These data were analyzed by
analysis of variance with the same orthog-
onal comparisons of delay and the imme-
diate test conditions as in Experiment I (D1,
D2, T1, T2, F, R).

Fig. 3. Mean correct 7-day recall tests in Experiment II: delay, feedback, and
immediate test conditions.

Overall retention for the combined 24- 
hour and 20-minute delay groups was sig- 
nificantly greater than that for 0-minute de- 
lay (F = 7.37, df = 1/162, p < .01) but,
contrary to Experiment I, there was no re- 
liable overall difference between 24-hour 
and 20-minute delay. The overall effects of 
the immediate test conditions and the type 
of retention test were the same as in Experi- 
ment I. Retention was significantly better 
on the recognition test than on the recall 
test (F = 4912.19, df = 1/162, p < .001) 
and following the combined immediate tests 
than with no immediate test (F = 21.44, 
df = 1/162, p < .001). Retention was better 
following the immediate recognition test 
than following the immediate recall test 
(F = 8.85, df = 1/162, p < .01) and this
effect was significantly greater on the 7-day
recognition test than on the 7-day recall
test (F = 28.88, df = 1/162, p < .001).

Also, as in Experiment I, there was no
overall effect of the form of feedback but
there were some significant interaction ef-
fects between delay, form of informative
feedback, and the other variables. One com-
ponent of the interaction between form of
feedback, delay, and immediate test condi-
tion (D1 × T1 × F) was significant (F =
6.94, df = 1/162, p < .01). The effect of the
immediate tests upon the superiority of the 
combined 24-hour and 20-minute delay 
groups was essentially the opposite for the 
two forms of feedback. When feedback was 
Right-Wrong Cue, the delay groups were 
superior only when there was no immediate 
test; and, following an immediate test, 7-day 
retention with 0-minute delay improved and 
delayed feedback was no longer superior. In 
fact, under these conditions, 7-day retention 
with 0-minute delay did not differ apprecia- 
bly from that for the delay groups with any 
form of feedback in either Experiment I or 
II. When feedback was Right-Wrong Defi-
nitions, delay was superior following the 
two immediate tests only. This finding was 
similar to that in Experiment I for variable 
feedback, indicating that this effect does not 
require that the additional information in 
informative feedback be directly relevant to 
the test. 

One significant component of the four-
way interaction between delay, form of
feedback, immediate test condition, and
form of retention test (D1 × T1 × F × R)
indicated that this latter effect with Right-
Wrong Definitions was greater on the 7-day
recall test than on the recognition test (F =
5.11, df = 1/162, p < .05). A second signifi-
cant component of the four-way interaction
(D2 × T2 × F × R) indicated that the rela-
tive superiority of 24-hour to 20-minute de-
lay differed with the two forms of informa-
tive feedback, and the form of both the im-
mediate and 7-day tests (F = 4.81, df =
1/162, p < .05).

Fig. 4. Mean correct 7-day recognition tests in Experiment II: delay, feedback, and
immediate test conditions.


DISCUSSION 


These findings support the interpretation 
that superior retention with 24-hour delay 
of informative feedback (the delay retention 
effect) is due primarily to factors operating 
at and/or following feedback and not to fac- 
tors operating during the delay interval per 
se. The facilitative effect of 24-hr. delay of 
feedback upon retention varied with the 
form of informative feedback, which fol- 
lowed the delay interval, and thus these 
differences could not be attributed to factors 


operating during the delay interval. More 
specifically, these findings support the inter- 
pretation that the delay retention effect de- 
pends upon (a) stimuli present during in- 
formative feedback, (b) how the subject re- 
sponds to these, and (c) the relevance of 
these stimuli and responses to the retention 
test. However, it is the relationship between 
stimuli presented at feedback and those on 
the immediate retention test that is impor- 
tant, since this effect did not depend upon 
the form of the 7-day tests. Thus, with re- 
dundant forms of feedback, where the addi- 
tional information was not relevant to the 
immediate recognition test, retention did not 
vary with delay of feedback. With variable 
forms of feedback there was a marked su- 
periority of delayed feedback following the 
immediate recognition test. These results are 
consistent with those of Sturges (1969) in 
which superior retention with delayed feed- 
back also occurred when informative feed- 
back presented stimuli relevant to the im- 
mediate retention test. Also, in Experiment 
II, superior retention with delayed feedback
was changed by manipulating the form of
feedback. When informative feedback was
presented in a form that required subjects 
with immediate feedback to respond to more 
than the correct alternative at feedback, 
their retention was improved and delayed 
feedback was no longer superior. Thus, these 
findings support the interpretation that the 
delay retention effect occurs because Ss with 
delayed feedback respond to more cues or 
stimulus aspects of informative feedback, 
thus learning more about the item; and that 
when these cues can be used in retention, de- 
layed feedback improves retention. 

Which cues are most likely to be utilized 
to facilitate retention also depends upon the 
presence and form of the immediate test as 
well as the delay of feedback. With no im- 
mediate test, superior retention with delayed 
feedback occurred only when feedback pre- 
sented a cue. With an immediate recall test, 
the delay retention effect occurred with all 
forms of feedback except that with a cue; 
and with an immediate recognition test it 
occurred only when feedback was in varia- 
ble form. These results are consistent with 
the following interpretations. With delayed 
feedback, subjects acquire additional infor- 
mation presented at informative feedback. 
The utilization of this information to facili- 
tate retention depends upon the kind of in-
formation acquired and what follows imme-
diately, that is, the immediate test. Thus,
when informative feedback is a cue, infor-
mation acquired with delayed feedback is
sufficient to facilitate later retention with
no immediate practice. When feedback indi-
cates the correct alternative and provides
additional information, subjects with de-
layed feedback acquire this information.
However, utilization of this information to
facilitate retention requires some immediate
practice. The immediate recall test provides
immediate practice with minimal cues and
thus any additional information can be uti-
lized and later retention is facilitated,
whether this information is relevant to the
later test or not. An immediate recognition
test provides immediate practice with the
entire item presented. If the previous infor-
mation acquired with delayed feedback is
incompatible with the cues presented on
this test, this information cannot be utilized
directly on the recognition test and thus
there is no facilitation on later retention, re-
gardless of the form of the later test. If the
information previously acquired with de-
layed feedback is not incompatible with the
stimuli or cues on this test, it can be used
and later retention is facilitated.


It is tempting to postulate some mechanisms
involved, but it is difficult to do so with any degree
of rigor. My general point is that after a 24-hour
or 20-minute interval, subjects respond differently
to the “same” informative feedback than when it
is presented immediately item by item. How can
this be? Why should it matter? My approach to this
is that presentation of information about the correct
alternative, that is, informative feedback, functions
as a stimulus to which a subject responds and his re-
sponse is what he learns. Also, the stimulus of in-
formative feedback is presented in the context of
the subject’s response to the preceding stimuli.

When a subject makes a response to an item,
whether it is a guess as on the first presentation or a
response in a test situation, it is likely that this
would lead him to ask “am I correct?”. Thus, when
informative feedback is presented immediately, his
response to feedback in this context may well be to
the fewest stimuli necessary to determine “I got
that right”, for example, position of the correct al-
ternative in the redundant form, and this may be all
he is learning. After a delay interval, the subject
must read the stem, at least, to determine what the
item is and apparently he often reads all alterna-
tives and, thus, may be learning the stem, the cor-
rect, and the incorrect alternatives.

According to this interpretation, with 
most forms of informative feedback subjects 
with immediate feedback do not acquire 
much information at feedback and immedi- 
ate practice does not facilitate later reten- 
tion. Even when informative feedback 
presents a cue, subjects with immediate 
feedback learn this additional information 
at a minimal level and some immediate 
practice is required to facilitate later reten- 
tion. 

What are subjects actually learning with 
delayed feedback that facilitates retention? 
Although different forms of informative 
feedback presented different information in 
addition to the correct stimulus-response 
words, the effect of this additional informa- 
tion may have been primarily to make more 
salient the correct alternative. In Experi- 
ment I there was no differential effect of de-
lay for the two variable forms of informa- 
tive feedback, the correct alternative only 
or the correct and incorrect alternatives. 
Thus, superior retention with delayed feed- 
back did not depend upon the presentation
of the incorrect alternatives per se but 
rather upon the presentation of any changed 
or additional relevant cues at feedback. 
Some evidence on the question of what is 
learned is provided by the kind of errors 
made on the recall test. The percentage of 
errors that were incorrect alternatives from 
the same item was computed. Figure 5 
shows these data for the 7-day recall test 
in Experiment I. Again the immediate test 
conditions made a difference. Following an 
immediate recognition test there is a marked 
relationship between the percentage of er- 
rors of this kind, the delay interval, and 
form of informative feedback. With 24-hour 
delay there was an increase in percentage of 
errors that were incorrect alternatives when 
feedback had included all alternatives. With 
0-minute delay there was essentially no re- 
lationship between the kinds of errors and 
form of informative feedback. These find- 
ings support the interpretation that, after a 
delay interval, subjects actually acquire 
more of the information presented at feed- 
back, that is, the incorrect alternatives as 
well as the correct. 
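
The error measure used here can be sketched as follows. This is a hypothetical illustration, not the author's scoring procedure: the function name, the sample responses, and the decision to leave blank responses out of the error count are all assumptions. Only the sample item and its alternatives come from the display in Experiment I.

```python
# Sample item from the Experiment I display.
ITEM = {
    "stem": "TO CLEAR FROM BLAME",
    "correct": "EXCULPATE",
    "distractors": {"LUCUBRATE", "LIBRATE", "PROPITIATE"},
}


def same_item_intrusion_pct(responses, item):
    """Percentage of written errors that are incorrect alternatives of the same item.

    Assumption: blank responses are omissions and are not counted as errors.
    """
    errors = [r.upper() for r in responses if r and r.upper() != item["correct"]]
    if not errors:
        return 0.0
    intrusions = [r for r in errors if r in item["distractors"]]
    return 100.0 * len(intrusions) / len(errors)


# Hypothetical recall responses from six subjects to the sample item.
responses = ["EXCULPATE", "LIBRATE", "ABSOLVE", "", "PROPITIATE", "EXONERATE"]
pct = same_item_intrusion_pct(responses, ITEM)
print(pct)   # 50.0
```

A higher percentage means that more of a group's recall errors were words seen as incorrect alternatives for that item, which is the evidence the text uses to argue that those alternatives were acquired at feedback.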

Additional evidence on what subjects are 
learning with delayed feedback is provided 
by the two retention tests. Superior retention 
with delay, both overall and as a function of 
the form of feedback, occurred on both a re-
call and recognition test, and in some condi- 
tions it was significantly greater on the re- 
call test. Thus, the effect of delay is not due 
solely to improved discrimination among al- 
ternatives at a recognition level or to mini- 
mal retention, either of which would require 
the entire item to be presented. Rather, 
whatever is learned with delay is available 
with minimal cues at retention. The recall 
test was used as a measure more sensitive 
to organization of or among the alternatives, 
and these findings provide some support for 
this assumption. In Experiment I overall 
performance on the 7-day recall test was 
best when informative feedback consisted of 
all alternatives, although on the recognition 
test it was best with the correct alternative 
only. Also, the interaction between the form 
of the immediate and 7-day tests indicates 
that optimal retention with minimal cues 
requires more than repetition of the correct 
alternative. Performance on the 7-day recall 
test was at least as good following an im- 
mediate recall test as after an immediate 
recognition test, even though the mean 
number of correct responses given on the 
immediate recall test was markedly less 
than that given on the immediate recogni- 
tion test. Thus, the process of producing the
correct response contributes considerably
more to later recall than that of selecting or
identifying the correct alternative.

Fig. 5. Percentage of errors, incorrect alternatives from same item, 7-day recall tests in
Experiment I: delay, feedback, and immediate test conditions.

Experiment II also provides evidence that 
superior retention is due to additional infor- 
mation acquired at feedback. When infor- 
mative feedback was a cue, subjects had to 
read the cue and the alternatives to find the 
correct alternative and thus it would be ex- 
pected that they responded to all alterna- 
tives. Also, the cue itself could be used 
either as a direct associative link between 
the stem and the correct alternative or as a
basis for organizing the units of the item, 
that is, for identifying relationships between 
the stem and the correct alternative and/or 
the incorrect alternatives. Under these con- 
ditions retention was optimal for delayed 
feedback with no immediate test and for 
immediate feedback following immediate 
tests. Thus, the present findings are con- 
sistent with the interpretation that reten- 
tion of the correct stimulus-response words 
is facilitated when subjects have identified 
relationships among the stem, the correct, 
and the incorrect alternatives. 

Three delay intervals were included to 
help identify factors involved in the effect of 
24-hour delay of feedback. The present find- 
ings indicate that both the sequence of
events and the length and lack of experi- 
mental activity during the delay interval 
contributed to superior retention with de- 
layed feedback. However, the sequence of 
events contributed more. It appears that 
when informative feedback is presented im- 
mediately, item by item, subjects acquire 
the least information necessary to determine 
the correctness of their previous response. 
It is as though their response to feedback is 
merely “I got that right,” or “I got that 
wrong,” and this may be primarily what 
they are learning. In order to improve reten- 
tion it seems to be necessary that the presen- 
tation of informative feedback be such that 
subjects acquire information about the item 
that is relevant to its retention. Apparently, 
this can be accomplished either by delaying 
the presentation of informative feedback or 
by manipulating the form in which feedback 
is presented, for example, with a cue. 

In conclusion, it seems that the effect of 
the 24-hour delay interval is best interpreted 
as the effect of spacing of learning events.
According to this interpretation the infor-
mation presented at feedback in combina- 
tion with the spacing between initial presen- 
tation and informative feedback determines 
how subjects respond to feedback and what 
they acquire at the presentation of feed- 
back. The kind of information acquired at 
informative feedback in combination with 
the kind of immediate practice determines 
what is retained. 

These findings also suggest the following 
hypotheses about long-term retention of 
meaningful material. For optimal retention 
under conditions of minimal cues, mere repe- 
tition of the response to the stimulus word 
is not sufficient. Rather, long-term retention 
is improved when conditions are such that 
subjects identify relationships between the 
to-be-remembered units and other possible 
alternatives. Perhaps a kind of network is 
developed in which the correct response is 
integrated with incorrect alternatives; and 
long-term retention is better when there is 
such a network than when subjects have 
acquired the correct alternative only. Ac- 
cording to this interpretation, the spacing 
of learning events and the information pre- 
sented at informative feedback and immedi- 
ately following it would be important in 
providing opportunity for the development 
of such a network. Optimal retention under 
conditions in which many cues are present 
is not so dependent upon organization of the 
material. However, even in this case reten- 
tion is facilitated when learning conditions 
are such that some exploration of the mate- 
rial occurs. 
Data were gathered over a 3-year period from more than 300 black
and white four- and five-year-olds attending prekindergartens in
poor, urban neighborhoods of Atlanta, Georgia. From these data a
correlation matrix was generated and factor analyzed. Five factors
emerged and were replicated statistically: (a) verbal facility;
(b) coping with anxiety by withdrawal; (c) coping with anxiety by
aggression; (d) alienation; and (e) biological sex. For both boys
and girls, the variable cluster associated with a was negatively
correlated with the b cluster. The variable cluster associated with
a was negatively correlated with c only for girls. The results were
interpreted to mean that coping by withdrawal indicates personality
maladjustment and interferes with verbal facility. Although coping
by aggression does not directly interfere with verbal facility,
girls with high verbal facility choose other means of coping with
anxiety.


The crucial importance of the noncogni- 
tive dimensions of personal function as pre- 
dictors of academic success has sparked 
many studies relating cognitive measures, 
measures of learning, and measures of aca- 
demic performance to personality variables 
of all types (e.g., Lekarczyk & Hill, 1969). 
Informal documentation also indicates the 
importance of noncognitive variables as 
predictors: When asked to choose what in- 
gredients were most likely to lead to school 
and life success, both teachers and parents 
picked social skills, goal directedness, and 
emotional stability, rather than IQ or apti- 
tude as the most worthwhile qualities 
(Getzels & Jackson, 1961). Personality 
traits play a major role in both success or 
failure in school and in quality of later-life 
adjustment. For slum children, who appear 
to be less influenced than economically 
advantaged children by what traditional 
cognitive instruments measure (see Jensen, 
1968), the need to identify noncognitive 
correlates of success is crucial. 

Studies in which socialization variables 
have been related to a criterion, whether 
that criterion is school success, occupational 
status as an adult, or personality disorders, 
have typically included specific measures of 
socialization processes compared, one at a 
time, to particular behavioral indices. Al- 
though this procedure is effective in estab- 
lishing individual correlates of a criterion 
measure, data of this sort are not likely to 
reveal more fundamental, most probably 
quite complex, dimensions within the do- 
main of possible independent measures that 
may relate more closely to the performance 
criterion. There are other disadvantages re- 
lated to using special purpose measures, re- 
gardless of whether they are based on “mere 
face validity” or on the “correlational test- 
ing of predictive validity.” (Cattell, 1965, 
p. 81): First, “the combination of traits 
which worked in one kind of group may not 
work in another [p. 81].” This difficulty is 
particularly evident in predicting learning 
ability from IQ scores for children of different 
socioeconomic status (SES) levels. Second, 
characteristics peculiar to a particular 
measure (uniqueness) are given too much 
weight. For example, the child who has been 
practicing a list of words and their defini- 
tions intensely does better on a test of rec- 
ognition vocabulary than one who has not, 
even though the latter may be the more 
verbally skilled. The specificity of a partic- 
ular instrument is minimized by employ- 
ment of a battery of measures covering a 
broad spectrum of behaviors. 

Third, when only specific correlational 
relationships are known, particularly the 
low-level relations typically found in per- 
sonality research, many variable-specific 
theories and models may be generated that 
can be related only to the research that has 
produced them. The net result has been that 
psychologists do not agree on a limited set 
of variables to describe and predict behav- 
ior, nor do they agree in their manner of 
defining variables already in common usage 
(Horst, 1966). Hence, the fragmentation of 
research effort that characterizes much of 
the cognitive literature is likely to become 
the hallmark of the socialization-personal- 
ity area unless empirically established di- 
mensions, recognizable across age levels, in- 
dividual measures, and culturally divergent 
groups, begin to appear in the literature 
(Cattell, 1965). 

One of the best methods of establishing 
dimensions for data matrices is factor anal- 
ysis. Factor analytic techniques provide 
models that are based upon dependencies 
inherent in the data, and are particularly 
useful whenever the number of variables is 
too large to be readily interpretable by first 
order statistical techniques, in fields or do- 
mains of enquiry where there has been little 
work indicating the structure and interrela- 
tionships among the measures being used, or 
whenever it is required to unravel a structure 
of dependence where there are no a 
priori patterns of causality evident (Cattell, 
1966a; Morrison, 1967). 

The role of cognitive variables in the 
development of personality, particularly 
verbal intelligence, remains ambiguous 
(Cattell, 1965). However, much interest has 
been generated in how personality measures 
and tests of intelligence relate to each other. 
Thus, it is profitable to regard the personality 
domain as including verbal intelligence 
rather than excluding it. Determining how 
socialization and verbal competence interact 
can only be accomplished by concurrent 
analyses including both kinds of measures. 
Hence, the present study is focussed on the 
augmented personality domain, including 
both socialization dimensions and verbal 
abilities. 
METHOD 


Subjects 


Sample 1. Approximately 220 black and white 
four- and five-year-olds from 11 schools in urban 
Atlanta, Georgia, participated in the testing program. 
Most of these children were from poor 
neighborhoods and were disadvantaged from the 
standpoint of parental income. 

Sample 2. In a previous testing program, 120 
black and white four- and five-year-olds from three 
experimental prekindergartens in Atlanta served 
as subjects. These boys and girls attended three 
Educational Improvement Project schools during 
the years 1966, 1967, or 1968. Like their counter- 
parts in Sample 1, these subjects were disadvan- 
taged from the standpoint of parental income. 


Demographic and Parent-Child Variables 


The first five demographic variables are: sex, 
race, chronological age, father absence, and mother 
absence. 

Birth order. Although order of birth is one of 
the most widely studied demographic variables, 
it remains to be seen if the most general findings 
pertaining to this variable can be applied to slum 
children. 

Household density. The authors reasoned that 
density is a plausible measure of the degree of 
competition experienced by the child for available 
commodities in the home (e.g., affection from 
parents, the limited supply of material possessions 
belonging to the family, and living space). Scores 
were obtained by dividing the number of persons 
living in the child's home by the number of rooms 
occupied. 

Occupational status. Because most children in 
the present investigation come from poor neigh- 
borhoods, have parents with low-quality educa- 
tions, and are generally similar to each other on 
most commonly used indicators of social class, oc- 
cupational status alone was used to indicate rela- 
tive levels of socioeconomic status. Data for this 
measure were obtained from parent interviews and 
teacher supplied information sheets. 

Tangible enrichment. For a slum child, an 
"impoverished" environment is not necessarily one 
that lacks stimulation. On the contrary, it "seems 
to be one of overwhelming but undifferentiated 
stimuli: too many people in too little space, TV 
sets blaring indiscriminately, lack of organized and 
orderly meal times, lack of opportunity to converse 
in depth or variety with either brothers and sisters, 
parents, or other adults, cluttered houses, ...and 
other such factors [McCandless, 1967, p. 161].” 
Thus, the present measure of tangible enrichment 
is intended to serve as a rough estimate of the 
amount of organized stimulation present in the 
child’s home environment. Families that provide 
access to library cards and supplementary reading 
materials were judged to provide more tangible, 
functional enrichment than homes without such 
provisions. A child’s score for this measure can 
range from 1 (neither library cards nor extra read- 
ing material provided) to 3 (indicating the pres- 
ence of both). 

Interest in child's education. Parents or guardians 
who frequently talk to teachers about their 
child’s progress in school, who have attended 
parents’ meetings at school, and who require their 
children to attend school even when it is incon- 
venient to do so, were rated high in interest in 
their child’s education. Scores for this measure can 
range from 1 to 5, a higher score indicating greater 
interest. 

Peer Popularity 

Picture Sociometric Technique, Like. The 
amount of preference shown for a child by his age 
mates can be related to other personal-social 
characteristics of that child (e.g., Marshall & 
McCandless, 1957; McCandless, Balsbaugh, & 
Bennett, 1958; Moore, 1967). Records of early 
peer acceptance in elementary school have also 
been found to predict adult social adjustment 
(Roff, 1961). 

In the present investigation, the McCandless- 
Marshall Picture Sociometric Technique, Like was 
used to measure popularity (McCandless & Mar- 
shall, 1957). This technique has been found “to 
yield data that are both valid and reliable [Moore, 
1967, p. 4]." Sociometric-Like scores were obtained 
with the aid of a picture board displaying photographs 
of every child in a given class. As in other 
studies using this technique (e.g., Moore & Updegraff, 
1964), the task was enjoyed by both children 
and experimenter, data were obtained with 
dispatch, and the data were for the most part reliable. 
Test-retest reliability coefficients, based on 
a time interval of 10 days, have been reported by 
McCandless and Marshall (1957). 

Picture Sociometric Technique, Dislike. The 
interpretation of the low end of the Sociometric- 
Like scale is hazardous; “the only thing that can 
be said with confidence about children with low 
scores is that they were children who did not get 
many positive nominations [Moore, 1967, p. 5].” 
Such children may either be actively disliked by 
their classmates, or simply ignored. If low-visi- 
bility children are to be distinguished from disliked 
children, it is necessary to solicit negative as well 
as positive votes. Sociometric-Dislike scores were 
obtained in the same manner as those for Sociometric-Like, 
except that the subjects were asked 
for negative votes. Reliability was found to be con- 
sistently lower for Dislike than for Like (see 
Richards, 1970). 


Sex Role Preference 


Brown's (1956) IT Scale for Children (IT Scale) 
was administered to all children projectively (i.e., 
IT was kept in the envelope). The projective administration 
was preferred because of the possibility 
that “the IT figure contains predominantly 
masculine cues and that these cues have a significant 
influence on children’s, especially girls’, 
ITSC scores [Sher & Lansky, 1968, p. 328]." 


Children’s Self-Social Construct Test 


The 12 scales of the Children’s Self-Social Con- 
struct Test (Self-Social Construct Test) were “de- 
veloped as a part of a research program which has 
applied a non-verbal method of assessment of self 
and social constructs to a variety of problems and 
populations [Long & Henderson, 1968, p. 6].” 
Split-half reliabilities ranging from .48 to .73 have 
been obtained for the Preschool Form (Long & 
Henderson, 1967), the version that was used in 
the present investigation. For this measure, it is 
assumed that a child relates himself symbolically 
to the social configurations presented as he re- 
sponds to items. A description as well as coeffi- 
cients of concurrent validity for each scale indi- 
vidually can be found in the testing manual 
(Long, Henderson, & Ziller, 1970). The scales 
used in the present study include the following: 
Esteem; Dependency; Identification with Mother; 
Identification with Father; Identification with 
Friends; Identification with Teacher; Realism of 
Color; Realism of Size; Forced Choice, Mother; 
Forced Choice, Father; Forced Choice, Friends; 
Forced Choice, Teacher. 


Intensity of Task Involvement Scale 


The Hodges and McCandless Intensity of Task 
Involvement Scale (Task Involvement) is based 
upon 5-second observational samples of individuals 
engaged in teacher assigned work activities. In the 
present study, 15 or more such samples were re- 
corded by raters for each child appearing in Sam- 
ples 1 and 2. Each sample was judged by the ob- 
server to fall into categories ranging from (1) 
totally unoccupied behavior (unoccupied), to (6) 
complete task involvement (complete). Task In- 
volvement scores represent the mean ratings over 
all the 5-second intervals. 


Peabody Picture Vocabulary Test (PPVT) 


Standard instructions for the PPVT were fol- 
lowed as closely as possible in the school settings 
encountered by the testers (see Dunn, 1965). 


Teacher Rating Scales 


Teachers in each of the preschools included in 
the present investigation were requested to complete 
Personal-Social Adjustment Rating Scales 
(Goldstein & Chorost, 1966), a series of 17 measures 
adapted from the Office of Economic Opportunity 
Project Head Start Teacher Rating Scales. 
Each of the first eight of these measures (Goldstein 
Rating Scales) provided teachers with five mutu- 
ally exclusive category descriptions that could be 
fitted to every class member. For the remaining 
nine Goldstein Rating Scales, each teacher was 
presented with a descriptive paragraph about a 
hypothetical child who epitomized the trait im- 
plied by the name of the rating scale. Possible 
scores for these nine rating scales range from 1 to 
5, lower scores indicating “very much like the 
trait descriptions,” and higher scores indicating 
“not at all like descriptions.” The following scales 
were included: Quality of Speech; Peer Relationships; 
Independence; Restraint of Motor Activity; 
Cooperation; Aggression; Active vs. Passive 
Speech; Verbal Skills; The Silent Child; Child 
with Separation Problems; Fearful or Tearful 
Child; The Isolated Child; The Child Who 
Doesn't Learn; The Provocative Child; The Un- 
happy Child; The Disruptive Child; The Hyper- 
active Child. Interjudge reliabilities based upon 
product-moment correlations of ratings made by 
two teachers independently, together with a complete 
description of each scale, are presented by 
Richards (1970). Reliabilities ranged from .46 to 


Analysis 1 


Although data for some measures were avail- 
able for all 220 children in Sample 1, only those 
with a score from every measure, or every meas- 
ure except one, were included in this analysis. The 
effective sample was 181 cases, and the data were 
99.8% complete. A subsidiary analysis based upon 
children with missing scores indicated that, with 
respect to means and standard deviations, this 
group differed little from the complete data group. 
The following steps were followed: 

1. A correlation matrix was generated and a 
principal components solution (without rotation) 
obtained using selected options of computer 
program BMD 03M (Dixon, 1965). 

2. Each of the first 20 eigenvalues of the 
matrix was plotted against the serial order in 
which it appeared. A scree test was performed 
to determine the probable number of factors 
necessary for a parsimonious, yet complete interpretation 
of the data (Cattell, 1966b). The 
scree test indicated that seven factors should 
be specified. 

3. BMD X72 (revised, June 18, 1968), under 
selected options, performed a factor analysis on 
the data yielding a seven factor solution. 

4. To insure that the data were fully interpreted, 
eight and nine factor solutions were also 
obtained. 


5. The data were sorted according to sex. 
Four additional solutions were obtained: seven 
factor, boys; seven factor, girls; eight factor, 
boys; eight factor, girls. These solutions were 
based upon data from 86 boys and 95 girls. 


6. All these solutions were rotated ortho- 
gonally according to the Varimax criterion. 
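
The orthogonal rotation in Step 6 can be illustrated with a minimal sketch. It assumes only two factors and finds the rotation angle by a simple grid search over the raw Varimax criterion; it is not the BMD routine used in the study, and the loadings, function names, and step count are hypothetical values chosen for illustration.

```python
import math


def varimax_criterion(loadings):
    """Raw Varimax criterion: for each factor, the variance of the squared
    loadings down that column, summed over factors."""
    n_vars = len(loadings)
    n_factors = len(loadings[0])
    total = 0.0
    for j in range(n_factors):
        squares = [row[j] ** 2 for row in loadings]
        mean_sq = sum(squares) / n_vars
        total += sum((s - mean_sq) ** 2 for s in squares) / n_vars
    return total


def rotate_pair(loadings, theta):
    """Rigid (orthogonal) rotation of a two-column loading matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def varimax_two_factors(loadings, steps=9000):
    """Grid search over [0, pi/2) for the angle that maximizes the criterion."""
    best_theta = 0.0
    best_value = varimax_criterion(loadings)
    for k in range(1, steps):
        theta = (math.pi / 2) * k / steps
        value = varimax_criterion(rotate_pair(loadings, theta))
        if value > best_value:
            best_theta, best_value = theta, value
    return rotate_pair(loadings, best_theta), best_theta


# Hypothetical unrotated loadings for six variables on two factors.
unrotated = [
    [0.60, 0.60], [0.55, 0.50], [0.65, 0.55],
    [0.50, -0.55], [0.60, -0.50], [0.55, -0.60],
]
rotated, theta = varimax_two_factors(unrotated)

# An orthogonal rotation leaves each variable's communality unchanged.
h2_before = [a * a + b * b for a, b in unrotated]
h2_after = [a * a + b * b for a, b in rotated]
```

Because the rotation is rigid, only the distribution of each variable's variance across factors changes; the total variance explained per variable does not.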


Analysis 2 


Complete sets of scores for 38 of the measures 
described previously were available for 74 sub- 
jects in Sample 2. Although seven- and eight-factor 
solutions were obtained for these data, a break- 
down by sex was not feasible due to the small 
sample size. As in Analysis 1, these solutions were 
rotated orthogonally. 


Analysis 3 

Harman (1967) has provided a direct means of 
determining degree of factorial similarity by the 
use of the coefficient of congruence (φ). A coefficient 
of congruence was calculated for each factor 
individually across Analyses 1 and 2. 
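
As a hedged illustration of the coefficient of congruence, the sketch below computes the sum of cross-products of two loading vectors divided by the square root of the product of their sums of squares. The first vector uses the four highest Factor I loadings reported in the Discussion; the second vector is hypothetical, since the Analysis 2 loadings are not given in the text.

```python
import math


def congruence(loadings_a, loadings_b):
    """Coefficient of congruence between two factors' loadings on the
    same variables, listed in the same order."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) *
                     sum(b * b for b in loadings_b))
    return cross / norm


# Factor I, Analysis 1: verbal skills, quality of speech,
# active vs. passive speech, PPVT (values reported in the Discussion).
analysis_1 = [0.74, 0.71, 0.58, 0.53]
# Hypothetical counterpart loadings, for illustration only.
analysis_2 = [0.70, 0.65, 0.60, 0.45]

phi = congruence(analysis_1, analysis_2)
```

Unlike a product-moment correlation, the loadings are not centered on their means, so the coefficient reflects agreement in both pattern and sign of the loadings.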


RESULTS 


Analysis 1 

The seven-factor solution for the total 
data matrix of Analysis 1 accounts for 41% 
of the total variance in the measurement 
battery of 43 variables. Communalities ranged 
from a low of .09 (realism of color 
scale of the Children’s Self-Social Construct 
Test) to a high of .81 (teacher ratings of 
the child who is isolated), with a median of 
.44. Measures with loadings exceeding .25 
on any of the resulting dimensions are 
shown in Table 1. Both eight- and nine-fac- 
tor solutions produced a Heywood case, 
and thus were mathematically unsatisfac- 
tory (Harman, 1967). 
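
The communalities and the Heywood check mentioned above can be sketched as follows, assuming an orthogonal solution in which a variable's communality is the sum of its squared loadings across factors. The loading matrix and function names are hypothetical.

```python
def communalities(loadings):
    """Communality of each variable: sum of squared loadings across factors."""
    return [sum(value * value for value in row) for row in loadings]


def heywood_rows(loadings, tol=1e-9):
    """Indices of variables whose communality exceeds unity."""
    return [i for i, h2 in enumerate(communalities(loadings)) if h2 > 1.0 + tol]


# Hypothetical loadings for three variables on two orthogonal factors.
example = [
    [0.90, 0.50],  # communality above 1: a Heywood case
    [0.30, 0.20],
    [0.70, 0.10],
]
flagged = heywood_rows(example)
```

A flagged variable implies a negative unique variance, which is why such a solution is treated as mathematically unsatisfactory.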

The seven-factor solution for females (n 
= 95) accounts for 44% of the total vari- 
ance of the battery, with communalities 
ranging from .09 (chronological age) 
through .88 (teacher ratings of the unhappy 
child) with a median of .49. Measures with 
loadings greater than .25 on any of the re- 
sulting factors are shown in Table 2. 

The seven-factor solution for males (n = 
86) accounts for 45% of the total variance, 
but this solution is mathematically unsatis- 
factory due to the occurrence of a commu- 
nality greater than unity. Therefore, it was 
necessary to interpret the results for males 
in terms of a six-factor space rather than 
the usual seven. The six factors shown in 
Table 3 account for approximately 42% of 
the total variance of the battery. Although 
Factors I and II were correlated, the orthog- 
onal scheme of rotation as a whole fitted 
TABLE 1 
FACTOR STRUCTURE OF ANALYSIS 1 


Variable 


Factor 


Verbal skills 

Quality of speech 

Active vs. passive speech 
Peabody Picture Vocabulary Test 
Sociometric-Like 

Independence 

High occupational status 

High tangible enrichment 

High self-esteem 


The isolated child* 

The unhappy child* 

The fearful or tearful child* 

The silent child* 

The child with separation problems* 
Peer relationships* 

The child who doesn’t learn* 

High household density 


Aggressive reactions* 

The provocative child* 

The disruptive child* 
Cooperation* 

Restraint of motor activity* 
The hyperactive child* 


Low identification with friends 
Low identification with teacher 
Low identification with father 

Low identification with mother 


Biological sex (female direction +) 
It test (male direction +) 

Forced choice, father 

Forced choice, teacher 


Father present 
Chronological age 
Mother present 

Realism of size 

Race (white direction +) 


Forced choice, mother 
Forced choice, friends 
Realism, color 



Note.—All loadings below | .25 | are omitted. 


* Socially acceptable direction is positive. 


the data well. It is apparent from these results 
that the common factor space for male 
subjects is less clearly differentiated than 
for female subjects. 


Analysis 2 

The seven- and eight-factor solutions for 
Sample 2 data account, respectively, for 
48% and 51% of the total variance of the 38 
TABLE 2 
FACTOR STRUCTURE OF ANALYSIS 1 FOR FEMALES 


Variable 


Factor 


I II III IV V VI VII 


Quality of speech AT 
Verbal skills .74 
Restraint of motor activity .63 
Independence .58 
Peabody Picture Vocabulary Test .57 
Sociometric-Like E 
Tangible enrichment 85 
High self-esteem .31 


The isolated child* 
The unhappy child* 
The silent child* .28 
Active vs. passive speech* .54 
The child with separation problems* 
The fearful or tearful child* 

Peer relationships* 

Household density 

The child who doesn’t learn* .36 
Intensity of task involvement 


Aggression* 

The provocative child* 
The disruptive child* 
Cooperation* .42 
The hyperactive child* 


Low identification with friends 
Low identification with father 

Low identification with teacher 
Low identification with mother 


Forced choice, friends 
Forced choice, mother 


Father present 
High occupational status .33 
Mother present 

Realism of size 

Interest in child's education 
Race (white direction +) 


Forced choice, father 
Forced choice, teacher 

It test (female direction —) 
Realism of color 



Note.—All loadings below | .25 | are omitted. 
* Socially acceptable direction is positive. 


variables included. Both these solutions 
were satisfactory statistically, and inspec- 
tion of the resultant factor loadings indi- 
cated that five of the Analysis 2 factors 
were remarkably similar to their counter- 


parts in Analysis 1. However, in this anal- 
ysis, Factor II was found to be split into 
Factors II A and II B in the eight-factor 
solution. These fragments were collapsed 
onto a single vector to facilitate direct com- 
TABLE 3 
FACTOR STRUCTURE OF ANALYSIS 1 FOR MALES 


Variable 




Verbal skills 

Active vs. passive speech 
Quality of speech 
Sociometric-Like 

The child who doesn’t learn* 
Peer relationships 

Peabody Picture Vocabulary Test 
High occupational status 
Forced choice teacher 
Independence 

High tangible enrichment 
Realism of size 

High Self-Esteem 


The isolated child* 

The fearful or tearful child* 

The silent child* 

The unhappy child* 

The child with separation problems* 




Aggression* 

The disruptive child* 

The provocative child* 
Restraint of motor activity* 
Cooperation* 

The hyperactive child* 


Low identification mother 
Low identification friends 
Low identification father 
Low identification teacher 
Mother present 


Forced choice, father 
Forced choice, friends 
Household density 


It test (male direction +) 
Race (white direction +) 
Sociometric-Dislike 
Chronological age 
Forced choice, mother 
Father present 


Factor 


I II III IV V VI 




Note.—All loadings below | .25 | are omitted. 
* Socially acceptable direction is positive. 


parison with Factor II of Analysis 1. Col- 
lapsing was done graphically, the resulting 
vector drawn at a 45° angle to both II A 
and II B. 
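
A numerical analogue of this graphical collapsing can be sketched under the assumption that II A and II B are orthogonal unit-length factors. A vector drawn at 45° to both is then their normalized sum, and a variable's loading on it is the projection shown below. The function name and the example loadings are hypothetical; the authors report doing this step graphically, not by this formula.

```python
import math


def collapse_pair(loading_a, loading_b):
    """Loading on the vector bisecting two orthogonal factors:
    the projection onto (1, 1) / sqrt(2)."""
    return (loading_a + loading_b) / math.sqrt(2.0)


# Hypothetical loadings of one variable on II A and II B.
collapsed_equal = collapse_pair(0.50, 0.50)   # variable loading equally on both
collapsed_single = collapse_pair(0.60, 0.00)  # variable loading on II A only
```

A variable that loads equally on both fragments gains on the collapsed vector, whereas one that loads on only a single fragment is attenuated.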


Analysis 3 


The resulting coefficients of congruence 
across the two analyses are as follows: 
Factor I, .83; Factor II, .91; Factor III, .93; 
Factor IV, .87; Factor V, .63; Factor VI, 
.43. Factor VII was not replicated by 
Analysis 2. 

DISCUSSION 


Factor I 


The four highest loadings on Factor I of 
Analysis 1 (see Table 1) include the following: 
verbal skills, r = .74; quality of 
speech, r = .71; activity versus passivity of 
speech, r = .58; and verbal intelligence 
(PPVT), r = .53. These loadings, the major 
determiners of this vector, include only 
ability variables concerned with verbal 
communication. Thus, Factor I has been 
named “verbal facility,” a component that 
resembles Thurstone’s (1946) “verbal com- 
prehension” vector. Other variables that re- 
late to Factor I include measures indirectly 
associated with verbal facility: learning 
easily, r = .39; high restraint of motor ac- 
tivity, r = .36; and not silent, r = .33. This 
vector was fully replicated in Analysis 2 
(φ = .83). 

The finding that the Picture Sociometric 
Technique, Like loads primarily on Factor 
I (r = .48) is consistent with the results of 
several studies relating SES, school performance, 
and intelligence to various measures 
of peer acceptance (e.g., Barbe, 1954; 
Davis, 1957; Elkins, 1958; Laughlin, 1954). 
Significant (p < .01), but low-level first 
order correlations ranging from .22 to .26 fit 
well with the findings that children who are 
the most popular are those who are more 
cooperative, happier, and less isolated in 
their play activities (Thompson, 1962). 
However, Sociometric-Like scores relate to 
the factor more in the manner of an ability 
than a temperament. It appears from these 
results that being able to communicate 
effectively with others (i.e., high verbal facility) 
is an important part of picture sociometric 
status among slum preschoolers. 
This finding is in accord with Dunnington's 
(1957) conclusion that verbal interaction is 
a major ingredient in social acceptance 
among nursery school children. Although a 
sizable proportion of Sociometric-Like variance 
(about 23%) is attributable to verbal 
facility, teachers’ ratings of good peer relationships 
are less influenced by this factor 
(r = .30). Furthermore, the first order correlation 
between Sociometric-Like and 
teacher ratings of peer relationships is only 
.19. Thus, teacher judgments of solitary, 
parallel, and cooperative play are not substantially 
related to picture sociometric status. 
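
The proportions of variance discussed above follow from squaring a loading or a correlation. The short sketch below reproduces that arithmetic for the Sociometric-Like loading of .48, the teacher-rating loading of .30, and the first order correlation of .19; the function name is hypothetical.

```python
def variance_share(coefficient):
    """Proportion of variance accounted for by a loading or correlation."""
    return coefficient ** 2


sociometric_like = variance_share(0.48)    # 0.2304, i.e., about 23%
teacher_peer_rating = variance_share(0.30)  # 0.09, i.e., 9%
first_order = variance_share(0.19)          # 0.0361, i.e., under 4%
```

On this arithmetic, verbal facility accounts for more than twice as much of the Sociometric-Like variance as of the teacher-rating variance.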


Many investigators have reported significant 
correlations between socioeconomic 
status and various cognitive measures (Jensen, 
1968). The present investigation provides 
still further evidence for this relationship. 
Even for the restricted range of parent 
occupations in Sample 1, the variable of oc- 
cupational status was found to be loaded 
moderately on verbal facility (r = .41). 
Despite the crudeness of the measure, tangible 
enrichment was also found to load on 
Factor I (r = .33). This finding is particularly 
agreeable to theorists who emphasize 
the role of environment in intellectual de- 
velopment (e.g., McCandless, 1964), al- 
though such results are not in conflict with 
other viewpoints. 

On the basis of Jensen’s (1969) review of 
the literature concerning intelligence test- 
ing, it was anticipated that race would load 
at least moderately on Factor I. This expec- 
tation was not supported by Analysis 1 (r 
= .03). Even the first-order correlation be- 
tween race as a variable and the Peabody 
Picture Vocabulary Test, an instrument 
that is culturally biased (see Spicker, 
Hodges, & McCandless, 1966), did not 
reach statistical significance (r = .10, p > 
.05; n = 181). The corresponding loading 
for Analysis 2 was also at a low level (r = 
.19), although in the direction predicted by 
Jensen. 
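
The nonsignificance of r = .10 with n = 181 can be checked with a brief sketch. The t statistic uses the standard formula for a product-moment correlation; the p value is obtained from Fisher's z approximation, which is a substitute chosen here so that only the Python standard library is needed and is not necessarily the test the authors used.

```python
import math
from statistics import NormalDist


def t_statistic(r, n):
    """t statistic for testing a correlation against zero, df = n - 2."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def fisher_z_p_value(r, n):
    """Two-tailed p value from Fisher's z transformation (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


t_value = t_statistic(0.10, 181)       # roughly 1.34 on 179 df
p_value = fisher_z_p_value(0.10, 181)  # well above .05
```

Both quantities agree with the reported result that the correlation between race and PPVT score did not reach the .05 level.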

Two other loadings are relevant to the 
present discussion: First, higher self-esteem 
(Self-Social Construct Test) is associated 
with higher verbal facility (r = .29). Thus, 
for slum children as well as those more ad- 
vantaged, a positive view of the self is re- 
lated to one's ability to communicate ver- 
bally with others. Second, choosing teacher 
rather than mother, father, or friends is as- 
sociated with this factor (r = .25). Perhaps 
because many slum parents seldom have or 
take the opportunity to converse with their 
children, it is not surprising to find choosing 
teacher associated with verbal facility. 

Verbal facility survived as a factor (see 
Tables 2 and 3) for both males and females 
after the data were sorted by sex. However, 
some individual loadings differed greatly 
between the sexes. Among girls, restraint of 
motor activity and cooperation contributed 
significantly to Factor I (r = .63 and .42, 
respectively). Neither of these variables en- 
tered importantly into Factor I for boys (r 
= .24 for motor activity; r = .12 for cooperation). 
On the other hand, good peer relationships, 
low isolation, happiness, choosing 
teacher, and not being silent loaded heavily 
on Vector I in male factor space (rs = 
.49, .46, .44, .44, and .40, respectively), but 
not in female factor space. However, the 
theoretical ramifications of these between- 
sex differences in factor structure depend 
upon the identification of Factors II and 
III. 


Factor II 


Because nearly every loading on Factor 
II is a teacher rating weighted in the so- 
cially undesirable direction, it is tempting 
to regard this vector as halo effect. How- 
ever, this explanation has been discarded 
for four reasons: First, since a single com- 
ponent is sufficient to account for substan- 
tial portions of the variance for all scales, 
halo effect cannot account for the splitting 
of teacher ratings into three virtually inde- 
pendent components. Second, the procedure 
used in obtaining ratings discouraged the 
operation of halo effect. Teachers rated all 
children on one scale before moving on to 
the next. Third, it is difficult to see how this 
effect can operate across two independent 
judges to the extent reported by Richards 
(1970). Fourth, most Goldstein Rating 
Scales describe specific-enough child behav- 
ior patterns that untrained observers can 
readily identify them. 

In a current review of the anxiety litera- 
ture, Phillips, Martin, and Meyers (1969) 
provide an extensive listing of antecedents, 
concomitants, and consequences of both sit- 
uational and trait anxiety. The behavioral 
manifestations of these anxiety correlates 
coincide remarkably with every variable 
loaded on Factor II (see Table 1). 

First, Phillips et al. (1969) include “in- 
creased isolation from others" as an impor- 
tant consequence of anxiety. Isolation was 
rated directly by teachers and loaded heav- 
ily on Factor II (r = .83). Poor peer rela- 
tionships, judged by the relative proportion 
of a child's time devoted to solitary rather 
than cooperative play, also loaded on this 
vector (r = .51). 

Second, facial expressions are a useful 
index of anxiety (Phillips et al., 1969, pp. 4, 


27). If so, then teacher ratings of chroni- 
cally “down at the mouth” suggest the pres- 
ence of anxiety (r = .66). 

Third, another major correlate ("some- 
times used interchangeably with anxiety 
...”) is fear. Ratings of fearfulness and the 
related syndrome, “separation problems," 
load on Factor II (r = .62 and .56, respectively). 

Fourth, extremely anxious children are 
characterized by Phillips et al. (1969) as 
having "reduced responsiveness to the envi- 
ronment." Hence, the observed loadings for 
the silent child (r = .58) and the child who
is passive in his speech patterns (r = .55).

Fifth, anxiety indirectly produces “dete- 
rioration in complex intellectual, problem 
solving, achievement, and learning activi-
ties.” Not being able to learn is one behav-
ioral indicator of such deterioration (r =
.42). Other more tangential effects include
teacher ratings of low verbal skills (r =
.25) and poor quality of speech (r = .25).

Finally, Phillips et al. report that contin- 
ued “exposure (especially in early years) to 
inconsistencies, severe restrictions, threats, 
and punishments from the interpersonal en- 
vironment; frustration of dependency and 
other important needs, with coercive con- 
trols over hostility, aggression, etc.; endur- 
ing fears and conflicts [1969, p. 11]" consti- 
tute the primary antecedents of trait anxi- 
ety. Because these antecedents are more 
commonly present in crowded living quart- 
ers (McCandless, 1967), high household 
density can be expected to indirectly con- 
tribute to anxiety. The obtained loading for 
household density as a variable confirms 
this expectation (r = .29). 

Taken together, the loadings summarized in
Table 1 strongly suggest that Factor II is at
the very least anxiety related. Because
teacher experience with the
subjects spanned almost 8 months, it is rea- 
sonable to assume that their ratings were 
related primarily to “trait anxiety” (Spiel- 
berger, 1966). Factor II was also replicated 
by Analysis 2 (φ = .91).

When the highest loadings of Factors I 
and II are plotted in two dimensions, it is 
apparent that the angle separating the two 
clusters is less than 90°. Hence, the clusters 
themselves are intercorrelated, “making it
impossible to fit both clusters with perpen-
dicular orthogonal factors [Blalock, 1960, p.
385].” The two vectors drawn through the 
centers of gravity of the clusters intersect 
at an angle of 51° indicating a factor inter- 
correlation of about .63. However, the first- 
order correlations between individual varia- 
bles across factors are necessarily deflated 
due to specific factor variance, the unrelia- 
bility of the measures involved, and com- 
mon variance associated with other vectors 
(Harman, 1967). Inspection of the first-or- 
der correlation matrix also indicates that 
low Factor I is associated with high Factor 
II. Thus, low-level negative correlations be- 
tween measures associated with verbal 
skills and those related to anxiety are to be 
expected. Especially for younger children, it 
has been reported by most investigators 
that correlations between various concomi- 
tants of verbal facility, particularly intelli- 
gence, and measures of anxiety (Phillips et 
al., 1969; Ruebush, 1963) are low and nega- 
tive in direction. This factor model also fits 
well into the nomothetic network of the pic- 
ture sociometric technique, particularly the 
Like scale. Because Sociometric-Like loads
primarily on Factor I, only low-level corre-
lations with Factor II-type variables are
predicted. Such correlations have been con-
sistently reported in the literature (Moore,

The factor structure broken down by sex
is also revealing. For girls, the intercorrela- 
tion of the vectors passing through the cen- 
troids of the clusters is lower than for boys 
(r = .57; r = .79 for boys). Thus, Factor II
is related more to verbal facility for boys 
than for girls. This finding supports the 
more general notion that anxiety-related 
variables are more closely tied to inade- 
quate personality functioning for boys than 
for girls (Phillips et al., 1969). It also repli-
cates other research relating specific cogni-
tive variables to tests of anxiety (e.g., Sara-
son, 1963).
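
The geometric reading of the factor intercorrelations reported above can be made concrete: the correlation between two unit-length centroid vectors equals the cosine of the angle separating them. The following is a minimal sketch and not part of the original analysis; the function names are hypothetical. It recovers the reported value of about .63 for a separation of 51° and converts the sex-specific intercorrelations (.57 and .79) back into angles.

```python
import math


def correlation_from_angle(degrees):
    """Correlation between two unit vectors separated by the given angle."""
    return math.cos(math.radians(degrees))


def angle_from_correlation(r):
    """Angle in degrees between two unit vectors whose correlation is r."""
    return math.degrees(math.acos(r))


# Angle reported for the Factor I and Factor II centroid vectors.
print(round(correlation_from_angle(51), 2))

# Sex-specific intercorrelations reported in the text (girls, then boys).
print(round(angle_from_correlation(0.57), 1))
print(round(angle_from_correlation(0.79), 1))
```

A larger correlation corresponds to a smaller angle, so the clusters for boys lie closer together than those for girls.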


Factor III


As with standardized measures of anxi- 
ety, teacher ratings of behavior may not be 
tapping the source trait itself, but the “cop- 
ing tendencies" employed to reduce level of 
anxiety (Phillips et al., 1969). For example,
all variables of Factor II are interpretable
as "coping by withdrawal" (e.g., isolating 
oneself, silent, unhappy, fearful, etc.). On 
the other hand, hostile and aggressive be- 
haviors may also be consequences (as well 
as sources) of trait anxiety (Phillips et al., 
1969). Thus, every variable loaded on Fac- 
tor III can be interpreted as a coping mech- 
anism emphasizing attack or high levels of 
reactivity: teacher ratings of aggressive re-
actions (r = .80); provocativeness to teach-
ers (r = .79); disruptiveness (r = .76); low
cooperation (r = .75); low restraint of motor
activity (r = .67); hyperactivity (r = .65).
Factor III was also replicated by Analysis
2 (φ = .93).

Although the loadings of Factor III are 
almost identical across the sexes (see Ta- 
bles 2 and 3), the relative positions of the 
variable clusters with respect to verbal 
skills vary markedly. For boys, the centroid 
vectors associated with Factors I and III 
are virtually orthogonal (r = .12). On the
other hand, for girls these two vectors are 
moderately correlated (r = .50). Hence, 
only “coping by withdrawal” (Factor II)
relates meaningfully to verbal skills for 
boys; both coping by withdrawal and “cop- 
ing by aggression” (Factor III) correlate 
moderately with this vector for girls. First-
order correlations indicate that high coping
by aggression is associated with low verbal 
facility. These sex related differences in fac- 
tor structure can be accounted for in terms 
of sex role socialization (Phillips et al., 
1969). Because it is less acceptable for little 
girls to behave aggressively, the appearance 
of this trait in girls is more likely to indi- 
cate personality maladjustment than it is 
for boys. For both girls and boys, but par- 
ticularly for boys, coping by withdrawal is 
a clear indicator of maladjustment. 


Factor IV 

Factor IV is defined by four loadings: 
Self-Social Construct scores for identifica- 
tion with friends (r = .77); identification 
with teacher (r = .71); identification with 
father (r = .70); and identification with 
mother (r = .69). Because these loadings 
seem to signify low identification or inti- 
macy with people, McCandless (1968) has 
chosen to call this factor “alienation.”
Higher scores indicate greater alienation.
This factor appeared in Analysis 2 as well
(φ = .87). The variable race is the only
other loading greater than .25 (r = .26; 
white direction). This vector is virtually or- 
thogonal to all other factors in the study. 

For girls, alienation is defined by low 
Self-Social Construct scores for identifica-
tion with friends (r = .84), low identifica-
tion with father (r = .74), and low identifi-
cation with teacher (r = .70). Unlike the
matrix including both sexes, low identifica- 
tion with mother only loads moderately on 
this vector (r = .43), and no other variable 
loads above .25. 

For boys, Factor IV is not only defined 
by all four Self-Social Construct identifi-
cation scores (mother, r = .82; father, r =
.72; friends, r = .76; teacher, r = .71), but
in addition two demographic and two be-
havioral variables also contribute to its
variance: being white (r = .35); mother
absent (r = .29); poor peer relationships (r
= .31); and feminine IT test score (r =
.28). Thus, for boys, Factor IV is not strictly
“test specific.” However, it remains to
be seen what behavioral and psychological 
correlates develop from this vector as the 
child matures. 


Factor V 


Factor V is made up primarily of biologi- 
cal sex (r = .74) and psychological sex 
preference (IT test; r = -.66). Other load-
ings serve mainly to expand the nomothetic
network for the Self-Social Constructs Test. 
Their loadings are shown in Table 1. Ob- 
viously, the breakdown by sex destroyed 
this factor. It was replicated by Analysis 2 
(φ = .63).


Factor VI 


Factor VI was only partially replicated 
by Analysis 2 (φ = .45), and none of its
loadings exceeds .50. Three of the variables 
included on this vector are demographic, 
and all of them are in the socially undesira- 
ble direction. These include father absence 
(r = .45), low occupational status (r = 
.39), and mother absence (r = .33). Factor
VI appears to include the remaining envi-
ronmental handicaps that have not already
been associated with Factors I, II, and III. 


The highest single loading for race (r = 
.28; black direction) also appears with this 
grouping. Low-level behavioral correlates 
include teachers' ratings of appearing un- 
happy (r = .28), Self-Social Construct 
scores showing less realism about one’s own 
size (r = .28), and Self-Social Construct 
Test, choosing mother (r = .28). The latter 
two loadings indicate the “less mature” 
direction for these variables (Long, Hender- 
son, & Ziller, 1970). The chronological age 
loading (r = .37) suggests that the home 
conditions (e.g., father or mother absence) 
are likely to deteriorate as the child be- 
comes older. 

Although Factor VI loadings can be in- 
terpreted for boys and girls separately, use 
of smaller samples to interpret a factor al- 
ready shown to have questionable stability 
is little more than speculation. 


Factor VII 


Factor VII was not replicated by
Analysis 2 (φ = .17), nor does it seem to
make psychological sense. No corresponding
factors appeared when these data were bro- 
ken down by sex. It is thus suggested that 
Factor VII represents only test specific and 
error variance.

COLLEGE ENVIRONMENTS:


JOHN A. CENTRA*

Three methods of assessing the college environment (student perceptions,
student self-reports, and objective institutional data) were
compared by use of multimethod factor analysis, a new technique 
which removes method variance by focusing on correlations between 
rather than within methods of measurement. A total of 53 college 
variables for 103 institutions were analyzed. The factors derived 
from the multimethod analysis were more specific and more free 
of method variance than those obtained in a principal axis factor 
analysis; the first four factors in both analyses, however, were sim- 
ilar in content, suggesting that these dimensions appear to be valid 
descriptions of how 4-year institutions differ from each other rather 
than differences related to methods of assessment. Convergent and 
discriminant validity for variables measured by more than one method
were also examined and discussed.


Among the several methods used in past 
research to assess or describe college envi- 
ronments, three of the most widely known 
are student perceptions, student self-re- 
ports, and published objective institutional 
data. The perceptual approach, a method 
pioneered by Pace and Stern (1958), relies 
on students’ reports of the activities and 
emphases of their institution; of importance 
are the collective student perceptions of the 
general characteristics of their college (see 
also, Pace, 1969; Pace & Stern, 1958). By 
contrast, the student self-report method re- 
quires students to report their personal in- 
volvement in various activities, their indi- 
vidual goals, their demographic-background 
characteristics, and the like; individual stu- 
dent responses are then averaged to repre- 
sent each institution’s score on each item 
(see, for example, Astin, 1968; Warren, 
1966). The third method, objective institu- 
tional data, includes such information as 
the average academic aptitude scores of en-
rolled students, the faculty-student ratio,
enrollment, and college income per student
(see, for example, Astin, 1965; and Rock, 
Centra, & Linn, 1970). 

There is some research evidence suggest- 
ing that the different methods of assessment 
do not produce exactly the same results, 
and that in particular the environment as 
measured through the perceptual approach 
differs in part from the environment as re- 
vealed by the student self-report method. 
Astin (1968), for example, analyzed the 
student self-report and student perception 
data from 246 institutions and concluded 
that the two methods involve somewhat dif- 
ferent aspects of institutional differences. 

The extent to which the three methods of 
assessing the college environment yield in- 
dependent measures of college variables was 
investigated further in this study. Specifi- 
cally, the degree to which variables within 
a single method have more in common with 
each other than they do with variables from 
other methods was examined. In addition, 
the related question of the convergent and 
discriminant validity for college variables 
measured by more than one method was in- 
vestigated. One purpose of this study was, 
therefore, to gain a better understanding of 
the three methods of assessing institutional 
environments. A second purpose was to in-
vestigate the dimensions that differentiate
among 4-year college environments. These
dimensions, it was hoped, would be more
broadly rooted than those based on only
one method of describing college environ-
ments.

METHOD


A new technique that Jackson (1969) refers 
to as multimethod factor analysis was explored in 
this study in order to examine common variance 
across methods and to investigate the convergent 
and discriminant validity of those institutional 
variables purportedly measured by more than one 
method. Campbell and Fiske (1959) argue that 
methods or tests should converge in their assess- 
ment of the same trait; moreover, if the method 
of assessment is to be considered independent of 
a given trait or variable, it must also show dis- 
criminant validity. Through multitrait-multi- 
method analysis, therefore, a method may be in- 
valid if the variables measured correlate too 
highly with variables with which they are sup- 
posed to differ. 
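
The two Campbell and Fiske requirements can be illustrated with a small sketch. The correlations below are invented for illustration only and are not values from this study; the function name, the dictionary layout, and the .40 cutoff for a "substantial" validity coefficient are likewise assumptions. Convergent validity is taken here as a substantial correlation between two methods measuring the same variable, and discriminant validity as that correlation exceeding the correlations that either measure has with a different variable assessed by the other method.

```python
def check_validity(r, trait, methods, min_convergent=0.40):
    """Apply two simplified Campbell-Fiske checks to one trait.

    r maps frozenset({(trait_a, method_a), (trait_b, method_b)}) to a correlation.
    Returns (convergent, discriminant) as booleans.
    """
    m1, m2 = methods
    validity = r[frozenset({(trait, m1), (trait, m2)})]
    rivals = []
    for pair, value in r.items():
        traits = {t for t, _ in pair}
        used_methods = {m for _, m in pair}
        # Keep different-trait, different-method values that involve this trait.
        if trait in traits and len(traits) == 2 and len(used_methods) == 2:
            rivals.append(abs(value))
    convergent = validity >= min_convergent
    discriminant = all(validity > rival for rival in rivals)
    return convergent, discriminant


# Hypothetical correlations among two variables measured by two methods.
toy = {
    frozenset({("activism", "perception"), ("activism", "self_report")}): 0.62,
    frozenset({("challenge", "perception"), ("challenge", "self_report")}): 0.45,
    frozenset({("activism", "perception"), ("challenge", "self_report")}): 0.20,
    frozenset({("challenge", "perception"), ("activism", "self_report")}): 0.15,
}
print(check_validity(toy, "activism", ("perception", "self_report")))
```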

Because of a number of limitations in evalu- 
ating multitrait-multimethod matrices, Jackson 
(1969) recommends multimethod factor analysis 
as a method of examining convergent and dis- 
criminant validity. He suggests eliminating method 
variance from multitrait-multimethod matrices by 
“orthogonalizing the diagonal monomethod mat-
rices prior to a principal components analysis and
rotation of axes [1969, p. 39].” Orthogonalization 
is achieved by substituting diagonal values of 
unity for communality estimates and by substi- 
tuting zeros for the correlations between tests 
within a single method of measurement. In addi- 
tion to separating method from trait variance, 
orthogonalization results in a larger number of 
factors than classical factor analysis. 
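
The orthogonalization step described above can be sketched as follows. This is an illustrative reading of the procedure, not Jackson's or the author's own code; the function name, the method labels, and the four-variable matrix are hypothetical. Within each method, correlations between different variables are set to zero and the diagonal is set to unity, while correlations between methods are left untouched.

```python
def orthogonalize_monomethod_blocks(corr, methods):
    """Return a copy of corr with every within-method block replaced by an identity block.

    corr    -- square correlation matrix as a list of lists
    methods -- method label for each variable, in the same order as corr
    """
    n = len(corr)
    if len(methods) != n or any(len(row) != n for row in corr):
        raise ValueError("corr must be square and match the length of methods")
    modified = [row[:] for row in corr]
    for i in range(n):
        for j in range(n):
            if methods[i] == methods[j]:
                # Same method: unity on the diagonal, zero elsewhere.
                modified[i][j] = 1.0 if i == j else 0.0
    return modified


# Hypothetical example: two perception scales and two self-report items.
labels = ["perception", "perception", "self_report", "self_report"]
R = [
    [1.00, 0.50, 0.60, 0.10],
    [0.50, 1.00, 0.20, 0.40],
    [0.60, 0.20, 1.00, 0.30],
    [0.10, 0.40, 0.30, 1.00],
]
M = orthogonalize_monomethod_blocks(R, labels)
for row in M:
    print(row)
```

The modified matrix would then be submitted to a principal components analysis and rotation; that step is omitted from the sketch.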

An illustration of multimethod factor analysis
and how it was applied in this study of college en-
vironments appears in Table 1. While Jackson has
discussed the technique in reference to validating
methods of measuring individual characteristics
(personality traits, for example), methods of meas-
uring institutional variables were compared in this
study of college environments. Thus, in the con-
text of the present study, the institutional vari-
ables derived from student perceptions, published
objective data, and student self-reports were inter-
correlated and factor analyzed using Jackson's
multimethod procedure. The procedure involves
replacing the monomethod-multivariable quad-
rants of the correlation matrix with identity mat-
rices. As indicated in Table 1, the original corre-
lations in the quadrants labeled I are replaced
with zeros, and unities are placed in the diagonal
positions. In so doing, all correlation, and there-
fore all variance unique to a single method of
measurement, is removed.


TABLE 1
ILLUSTRATION OF MULTIMETHOD FACTOR
ANALYSIS OF INSTITUTIONAL SCORES
BASED ON THREE METHODS OF
MEASURING THE COLLEGE
ENVIRONMENT

Method | Student perceptions | Published objective institutional data | Student self-report data
Student perceptions | R11 (I) | R12 | R13
Published objective institutional data | R21 | R22 (I) | R23
Student self-report data | R31 | R32 | R33 (I)

Note.—The original correlation matrix was 
modified by replacing the monomethod-multivari- 
able quadrant with identity matrices (I), with 
unities as diagonal elements and zeros as off- 
diagonal elements. 


Data Sources 


Student perceptions and student self-report 
data were both derived from the Questionnaire on 
Student and College Characteristics (QSCC), an 
instrument developed in 1968 for the purpose of 
gathering information about colleges that appli- 
cants might find useful (Centra, 1968). The ques- 
tionnaire was completed by a representative sam- 
ple of upperclass students at 116 colleges in the 
fall of 1968. For 103 of these institutions, objective 
institutional data from such published sources as 
American Universities and Colleges (Singletary, 
1968) and the Comparative Guide to American 
Colleges and Universities (Cass & Birnbaum, 1968) 
were also available. This group of 103 institutions 
comprised the sample for this study. 

A total of 53 college variables were assessed, 27
of which, based on their popularly assumed mean- 
ing, were measured by at least two methods as 
indicated in Table 2. For these 27 institutional 
variables, therefore, it is possible to examine con- 
vergent and discriminant validity. Strictly speak- 
ing, no 2 of the 53 variables are exactly alike; 
several do, however, attempt to measure the same 
domain and in some instances are assessed by more 
than one of the three methods. The general area 
of student activism, for example, is assessed by
student perceptions (Activism) and by self-re-
ported student involvement in activist organiza-
tions or civil rights activities. What might be
termed academic competitiveness or scholarship 
is estimated by three methods: student percep- 
tions (challenge, i.e., Many teachers allow students 
to slip by with less than their best efforts.), stu- 
dent self-reported amount of time spent studying, 
and published institutional data (number of books 
per student, percentage of faculty with a doctorate, 
average academic ability of students). Each of
these pieces of information, in other words, is

TABLE 2
EXPECTED OVERLAP AMONG 53 VARIABLES FOR THREE METHODS OF ASSESSING THE
COLLEGE ENVIRONMENT

Student perceptions | Published objective institutional data | Student self-report data
Cultural facilities | | Involvement in art, drama, dance, music
Faculty-student interaction | Enrollment; faculty-student ratio |
Challenge | Books per student; mean SAT scores | Amount of time studying
Activism | | Involvement in activist organizations, civil rights, international problems
 | Religious affiliation | Involvement in religious activities
 | Percentage of students to graduate or professional school | Expectation of attending graduate or professional school
Nonacademic emphasis | Existence of fraternities/sororities | Involvement in fraternity, sorority or similar group; involvement in school spirit activities; involvement in dating and social life
Lab facilities | | Involvement in science activities


generally considered a reflection of the academic
environment of an institution.

RESULTS

Results of the multimethod factor analy-
sis appear in Table 3. The 10-factor solu-
tion may be compared to a principal axis
analysis with an Equamax rotation to 6
factors which is included as Table 4. It is
clear that the multimethod analysis pro-
duced factors more free of method variance
than the dimensions identified by standard
factor analysis; this observation is sup-
ported by both the 6-factor principal axis
solution (Table 4) and a 10-factor solution,
not shown here. Factors 1 and 5, it can be
noted, include a high proportion of the vari-
ables from a single method. The 10-factor


TABLE 3
MULTIMETHOD FACTOR ANALYSIS OF MEASURES OF THE COLLEGE ENVIRONMENT:
METHOD, VARIABLES, AND FACTOR LOADINGS^a
(n = 103 institutions)
Factor | Student perceptions | Published objective institutional data | Student self-report data


% of men enrolled —1.085 Involvement in intramural ath- M 
a ES 
% to graduate or professional Involvement in art E 
-48 I agn in intercollegiate ath- m 
ics T3 
Fi se got perge in sag issues and E 
income per student 146 | Involvement in campus publications -63 
Mean freshman SAT-M "75 | Time spent studying in 
Mean freshman SAT-V 71$ | Involvement in speech and debate  —.5l 
income per student E socioeconomic status E 
ics at) avo venous in civil risbts cime 
% to graduate or professional Involvement in religious activities 54 
-—.01 Een to attend m sebool a 54 
Curriculum flexibility —— .50 | % living in reeidence halla 80. | Involvement in dating and social p 
7 | Nonacademio +61 | Existence of faternities or soror- Involvement in fraternity or sorority -73 
Faculty-student inter- F 64 
Satisfaction with 5 
: ont ad d (none over .23) io td S <61 
one over Religious affiliatioi nized 
10 ib facilities +15 r m Tajo ament A caer S AUR E. 


a A principal components analysis with a Varimax rotation was used.
b A loading of 1.00 or higher, as in this instance, is possible because identity
matrices were substituted for the monomethod correlation matrices.
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TABLE 4 
EQUAMAX ROTATION OF THE FACTOR ANALYSIS OF 53 COLLEGE VARIABLES^a
Variable Factor loading
Factor 1
Student self-report data
Involvement in intramural athletics .88
Involvement in intercollegiate athletics .86
Involvement in individual competitive sports .72
Involvement in art activities -.66
Involvement in recreational-outing activities .65
Involvement in foreign or art -.59
Involvement in pep rallies and other school spirit activities .54
Involvement in plays or dramatic productions -.48
Involvement in fraternity, sorority, or similar group .47
Involvement in folk, ballet, or modern dance -.47
Involvement in poetry or drama readings -.40
Expectation of attending graduate or professional school .41
Published objective institutional data
Percentage of men enrolled .76
Percentage of students to graduate or professional schools .44
Factor 2
Student perceptions
Nonacademic emphasis -.74
Faculty-student interaction .60
Student self-report data
Involvement in campus issues and student government .67
Involvement in religious activities .62
Involvement in campus publications .51
Involvement in fraternity, sorority, or similar group -.49
Involvement in vocal music .48
Involvement in community service .44
Amount spent on social life and incidentals -.42
Published objective institutional data
Existence of fraternities or sororities -.68
Enrollment -.67
Percentage of men enrolled -.48
Factor 3
Student perceptions 
Challenge -56 
Student self-report data + 
Involvement in speech and debate s es 
Amount of time spent studying i 46 
Family socioeconomic status : 
Published objective institutional data 81 
Average SAT-V score E 
Average SAT-M score 6 
ok of college income peo student k 
umber of facult; student 
Factor 4 p 
Student perceptions 
Activism fai 
Restrictiveness CB 
Curriculum flexibility . 
| Student self-report data eaa 81 
Involvement in student activist organizations 14 
Involvement in civil rights activities d “66 
Involvement in activities focusing on international problems 40 
Involvement in political activities 3 
Factor 5
Student perceptions
Cultural facilities .63
Laboratory facilities .55
Challenge .50
Curriculum flexibility .48
Faculty-student interaction .45
Student self-report data
Involvement in instrumental music .66
Involvement in plays or dramatic productions .44
Involvement in vocal music .43
Satisfaction with the college .71
Would recommend college to prospective students .56
Factor 6
Student self-report data
Involvement in dating and social life .78
Involvement in campus publications .46
Number of dates .81
Family socioeconomic status .60
Hours working part or full time -.50
Published objective institutional data
Percentage in residence halls .50

a n = 103 institutions.


principal axis solution, furthermore, yielded 
one factor in which six of eight perceptual 
measures received salient loadings. 

Orthogonalization has also resulted in 
fairly well-defined factors with no more 
than three variables from any one method. 
In examining the multimethod factors for 
the expected overlap across methods (i.e., 
Table 2 versus Table 3), some convergence 
of methods is evidenced. Following is a dis- 
cussion of each factor from the multi- 
method analysis (Table 3), including com- 
ments on both an interpretation of each di- 
mension and the validity of measures across 
methods. 


Factor 1 


The pattern of correlations for this factor 
suggests a “Female, Cultural versus Male, 
Athletic” dimension. It is similar to the first 
factor of the standard factor analysis ex- 
cept that there are fewer salient self-report 
variables and the student perception varia- 
ble of Cultural Facilities is now part of the 
factor. Only student self-reported involve- 
ment in art activities loads higher than .40 
on the factor; drama, dance, and music 
were expected also to be part of the fac-
tor (Table 2) but were not. Student percep-
tions of cultural facilities and their personal
involvement in a wide variety of cultural
activities are therefore not alternate ways 
of describing the same institutional varia- 
ble. 


Factor 2 


This second dimension, heavily influenced 
by student enrollment, is somewhat similar 
to the second factor in the standard factor 
analysis. Perceived faculty-student interac- 
tion receives a high loading (.74) and the 
published faculty-student ratio a moderate 
loading. These two variables, along with en- 
rollment, were expected to load on the same 
dimension, indicating validity for the two
methods of measuring student contact with 
faculty. The fact that the number of fac- 
ulty per student received only a moderate 
loading, however, suggests that the facul- 
ty-student ratio also reflects faculty in- 
volvement in research or nonteaching duties 
and is not necessarily an accurate indica- 
tion of faculty contact with students. 


Factor 3


As with the third factor of the standard 
factor analysis, this factor also reflects aca-
demic stimulation. Student-perceived aca-
demic Challenge, the mean freshman SAT
scores, and student self-reported time spent
studying are generally considered measures
of the academic environment; these institu- 
tional variables, each based on a different 
method, received the highest loadings on this 
third factor. The number of library books 
per student and the percentage of faculty 
with doctorates are usually considered ad- 
ditional measures of the academic environ- 
ment but they did not receive salient load- 
ings; their relevance as “academic environ- 
ment" measures critical to students may 
therefore be questionable. 


Factor 4 


Student-perceived activism (.95) and
student self-reported involvement in activ-
ist organizations (.62) and civil rights ac-
tivities (.57) loaded highly, as expected, on
this fourth factor. The two methods would 
therefore seem to converge in their assess- 
ment of campus political-social activity. 
The number of library books per student 
was the only objective institutional charac- 
teristic with a noticeable loading on this 
dimension. 


Factor 5


The pattern of correlations for this fifth 
factor suggests the highly regulated campus 
which sends few students to graduate 
school; in addition, students at these col- 
leges tend to be involved in religious activi- 
ties. Conversely, colleges at the alternate 
pole tend to be less restrictive, less religious, 
and to send a higher percentage on for fur- 
ther study. Both the percentage of seniors 
to graduate school published by the college 
and the percentage of students who report 
that they expect to attend received salient 
loadings.

The variables in this factor were spread
out over several dimensions in the standard
factor analysis (Table 4). 


Factor 6 


The sixth dimension includes one variable
from each method: curriculum flexibility, as
indicated by student perceptions; the per-
centage of students in residence halls (pub-
lished objective data); and student self-re-
ported involvement in dating and social
life. None of these variables were expected
to overlap with each other and logically
cannot be construed as convergent and dis-
criminant validity for a specific institu-
tional variable.


Factor 7 


There is little question that this is a fra- 
ternity-sorority dimension. Variables relat- 
ing to fraternity-sorority emphasis from all 
three methods, as expected, loaded on this 
factor: student perceptions of the nonaca- 
demic environment (.61), which is based 
heavily on fraternity-sorority life; the ex- 
istence of fraternities and sororities (.87), 
as revealed in published data; and student 
self-reported involvement in fraternities or 
sororities (.73). The nonacademic percep- 
tual variable received secondary loadings 
on other factors because of the variety of 
items that make up the score. Student self- 
reported involvement in social and school 
spirit activities did not load on this dimen- 
sion as expected. 


Factors 8 and 9 


The variables within each of these two 
factors were not expected to overlap. Stu- 
dent (self-reported) satisfaction with their 
college had, in the standard factor analysis, 
loaded with five of the perceptual variables. 
Only two of the perceptual variables now 
had loadings greater than .40 on the same 
dimension as student satisfaction. Factor 9 
includes religiously affiliated colleges; stu- 
dent involvement in religious activities did 
not load on this factor as expected, how- 
ever. Apparently enrolling at a religiously 
affiliated college does not assure that stu- 
dents will also become deeply involved in 
religious activities. In fact, as suggested by 
Factor 9, they would more likely get in- 
volved in organized politics (.42). 


Factor 10 

Student perceptions of lab facilities and 
their self-reported involvement in science 
activities were expected to be part of the 
same factor as indeed they are in this tenth 
factor. 

DISCUSSION

Several conclusions regarding the three 
college assessment methods and the result-
ing college environmental descriptions seem
warranted on the basis of the multimethod
factor analysis. First, it is clear that several
of the factors derived from the multimethod 
analysis are similar to those from the 
standard factor analysis. In particular the 
first four factors from both analyses iden- 
tify essentially the same college dimensions, 
although the multimethod factors include 
variables from each of the three methods 
and tend to be more specific than the fac- 
tors from the standard analysis. In other 
words, in spite of method variance being 
removed, the first four factors remain essen-
tially the same, arguing, it would seem, for
the stability and validity of these factors. 
These four factors—which might be re- 
ferred to as: (a) female, cultural versus 
male, athletic; (b) faculty-student interac- 
tion; (c) academic stimulation; and (d) ac- 
tivism—thus, are not the result of differ- 
ences related to methods of assessment but 
rather reflect valid descriptions of how 4- 
year institutions differ from each other. 
Second, the expected overlap between 
variables (Table 2) did not materialize in 
every case. There were, however, many in- 
stances when convergent and discriminant 
validity for the specific methods and varia- 
bles were evidenced—for example, student 
perceptions of activism and student self-re- 
ports of involvement in activist groups both 
loaded on the same factor. Of the 27 varia- 
bles, 17 overlapped as expected; for these 
variables and for the methods used, therefore,
convergent and discriminant validity
have been demonstrated, although replication
of these results with another sample would 
seem advisable. There were, nevertheless, a 
number of variables for which convergent- 
discriminant validity was not found, and to 
the extent that the classification scheme 
used in this study was reasonable in catego- 
rizing variables that measure the same do- 
main, then each method would seem to tap 
some information not predictably obtained 
by other methods. In general, therefore, 
there are certain kinds of information that 
can be obtained by only one method, even
when it appears that two or more methods
assess the same domain.

Finally, in addition to illuminating the 
relationships of college environmental vari- 
ables from one method of assessment to 
variables in another method, this study has 
also helped interpret differences among col- 
leges. In particular, the multimethod analy- 
sis has identified environmental constructs 
based on the several methods which appear 
to provide useful ways to describe the differing
climates among 4-year colleges.


DIRECT AND VICARIOUS REINFORCEMENT:
A NOTE ON PUNISHMENT AND NEGATIVE INSTANCES¹


J. A. CHEYNE²


University of Waterloo, Ontario, Canada 


An experiment was conducted to compare the effects of direct and vi- 
carious reinforcement (right or wrong) on learning and performance. 
Subjects (second-grade children) either experienced themselves, or ob- 
served a model experience, reward and punishment for selecting one 
word from each of 18 pairs. Subsequent testing indicated that direct 
and vicarious reward were equivalent in terms of correct performance 
and that vicarious punishment was inferior to direct punishment.
However, performance following instructions to a subject under vi- 
carious conditions explicitly requiring matching behavior indicated 
that punished responses were recalled at least as well as rewarded responses.
Implications of the findings, especially in terms of exposure
of children to undesired behaviors having negative consequences, were
discussed.


Since Thorndike’s classic studies of the 
Law of Effect, a continuing interest has 
been maintained in the comparative effects 
of rewards and punishments on learning and 
performance. Of all the research reported 
by Thorndike, little has been so thoroughly 
scrutinized and subjected to replication as 
that comparing the effects on learning of 
rewards and punishments, especially the effects
of the verbal reinforcers “right” and
“wrong.” 

Thorndike’s framework (e.g., 1932, 1935) 
presumes that the effects of rewards and 
punishments are direct and automatic and 
that these effects must fall directly upon 
the actor. Recent interpretations of reinforcement
have tended to suggest that the
effects of rewards and punishments are indirect
and have stressed the informational
and/or attention-evoking properties of reinforcers
(e.g., Nuttin & Greenwald, 1968;
Walters & Parke, 1965). While little research
has directly concerned itself with the
comparison of the effects of direct as opposed
to vicarious rewards and punishments,
recent research in the area of imitation
suggests that similar effects may be obtained
for vicarious and direct reinforcement
(Walters, 1968). Although some recent
studies have found vicarious reinforcement
somewhat less effective than direct
reinforcement (Myers, Travers, & Sanford,
1965; Van Wagenen & Travers, 1963), at
least one study has found vicarious experi- 
ence more beneficial than direct experience 
(Hillix & Marx, 1960). However, such 
studies have not been concerned with the 
relative effects of positive and negative re- 
inforcers. This lack of interest in the effects 
of vicarious reinforcement seems odd in 
view of the fact that much, if not most, 
classroom learning occurs during question 
and answer periods in which individual students’
responses and their consequences are
observed by the class. 

Many of the studies that have investi- 
gated the effects of the observation of re- 
sponse consequences to the model have been 
interpreted to suggest that such conse- 
quences have little or no influence on the 
acquisition of imitative responses but 
merely influence performance of such responses
(Bandura, 1968; Walters, 1968). In
an early study supporting this hypothesis,
direct reward for imitation eliminated dif- 
ferences among groups that had experienced 
vicarious reward, punishment, or no conse- 
quences following the model’s performance 
(Bandura, Ross, & Ross, 1963). Similar 
findings have subsequently been reported 
(Bandura, 1965; Walters & Parke, 1964; 
Walters, Parke, & Cane, 1965). However, 
recent studies by Liebert and Fernandez 
(1969) indicated that direct and vicarious 
reward were additive in their effects; that 
is, vicarious reward apparently had influ- 
enced the acquisition of the modeled behav- 
ior rather than merely the performance of 
the behavior. Liebert and Fernandez sug- 
gest that Bandura’s (Bandura et al., 1963) 
failure to find an acquisition effect from vi- 
carious incentives follows from the simple 
but interesting (i.e., aggressive) responses
to be learned. Similarly, the studies by Wal- 
ters and his collaborators (Walters & Parke, 
1964; Walters et al., 1965) employed situa- 
tions that were such as to minimize the ef- 
fects of both the informational and atten- 
tion-evoking properties of reinforcement. 
The problem of punishment offers some- 
what more difficulty for the theorist than 
does that of reward. On the one hand, pun- 
ishment provides the information for a sub- 
ject that the behavior eliciting punishment 
is not to be performed. On the other hand, 
there is evidence that some forms of direct 
punishment, because of particular atten- 
tion-evoking properties, may facilitate 
learning of the prohibited or wrong behav- 
ior (Cheyne, Goyeche, & Walters, 1969; 
Penney, 1967). However, punishment, to be 
judged effective, must facilitate learning 
and, simultaneously, inhibit that learned 
behavior. Thus, in the case of punishment 
there is a lack of concordance, such as ex- 
ists for reward, between the learning of the 
reinforced behavior and the performance of 
that behavior that may cause punishment 
to be judged less effective than reward on a 
performance measure. Moreover, the diffi- 
culty is compounded when we are dealing 
with vicariously experienced punishment. In 
such a case the incorrect behavior is demon- 
strated to a child by some model who is 
subsequently punished. This procedure is
not so very different from that employed by
Hovland and Weiss (1953) in their study of
the relative ineffectiveness of “negative instances”
in concept learning.

The experimental task used in the present 
study involved a two-choice situation in 
which a subject was required to learn the 
correct item of each of 18 pairs. Training 
and testing were limited to one trial, a pro- 
cedure that simplifies the interpretation of 
the effects of reward and punishment and 
precludes confounding of number and pat- 
terning of reinforcements. A similar modifi- 
cation of Thorndike’s method was devel- 
oped by Martens (1946). The task is simi- 
lar to that used by Liebert and Fernandez 
(1969) except that the present task involves 
verbal stimulus items rather than pictorial 
ones and is presented as a skill rather than 
a preference task. 

On the basis of the foregoing discussion it 
may be predicted that both direct and vi- 
carious reward and punishment influence 
performance such that reward will lead to a 
repetition of original items and punishment 
to the repetition of the alternate response. 
It is further predicted that when subsequent 
instructions nullify the informational value
of the incentives additional testing will in- 
dicate that punishment was at least as 
effective as reward in terms of learning and 
retention and more effective than indicated 
by the initial test. 


METHOD 
Subjects 


Subjects were 36 second-grade boys and girls 
for whom parental permission to participate in the 
experiment was obtained.³ An equal number of
boys and girls was randomly assigned to each of
two experimental conditions: direct or vicarious.


Experimental Arrangements 


Subjects were brought individually to a mobile 
laboratory situated outside the school and, under 
the direct conditions, were seated facing a small 
24 × 24 inch screen or, under the vicarious conditions,
beside another child (always of the same sex)
who was facing the screen. Under vicarious conditions,
the subject was facing in the same direction
as the screen but slightly to the left of the screen.
Located directly behind and 6 inches above the
subject (under direct conditions) or the model
(under vicarious conditions) was a Kodak Carousel
projector which was used to present the stimulus
materials.

³ The author acknowledges the cooperation of
the Kitchener and District Public School Board
and the principal and staff of Smithson School.

The stimuli consisted of 18 word pairs (taken 
from a Grade 2 reader) typed on 18 separate
slides. 


Procedure 


Upon entering the trailer, the subjects were 
seated and given the following instructions: 


Let me explain what we are going to do here
today. I'm going to show you some words up
here on the screen. Each time two words will
appear, one here and one here (the experimenter
indicated two points on the screen). Each time I
want you to say one of the words, the one you
think is the right one, and I'll tell you whether
you're right or wrong.


Under the vicarious conditions, these instructions
were directed to the model rather than the subject and,
following the instructions, the experimenter turned
to the subject and said, “You just watch this time.”


The experimenter next stepped behind the sub- 
ject (and the model) and began to operate the 
projector. After each response of the subject (or 
the model) the experimenter merely said right or 
wrong. In a predetermined sequence (with a dif- 
ferent randomization for each subject) the experi- 
menter said right for nine responses and wrong 
for nine. The experimenter recorded all responses 
made by the subject or by the model. 

Following presentation of all 18 slides, the ex- 
perimenter said to the subject, under the direct 
condition, “I’m going to show you those words 
again and I want to see how many you can get 
right now. This time I won't say anything. I'll just 
listen this time.” Under the vicarious condition, 
the experimenter asked the model to return to the 
classroom and asked the subject to sit where the 
model had been seated. The experimenter then 
said, “I’m going to show you those words again 
and I want you to say the one you think is the 
right one. This time I won't say anything. I'll just
listen this time.” 

The experimenter then presented the 18 slides 
again (Trial 2), once again recording the subject’s 
responses. This time, however, the experimenter
remained silent throughout the presentations.

For subjects under the vicarious condition, fol- 
lowing the second presentation, the experimenter 
said, “Now, I’m going to show you those words
again. This time I want you to tell me the words
(NAME) said.” The experimenter then presented
the 18 slides again (Trial 3), recording the subject’s
responses and remaining silent.

Following these operations, the experimenter 
thanked the subject for visiting the trailer and re- 
turned each subject to the classroom. 


Measures 


From the records of the experimenter, the right 
and wrong items on Trial 1 were determined. With 
this information the number of correct items given 
on Trial 2 could be determined (i.e., repeating
right responses and changing wrong ones). In addition,
for subjects under the vicarious condition
the number of matching responses (repeating what
the model said) could be determined on Trials 2
and 3.
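
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name `score_trial2`, and the example words are hypothetical and do not come from the study's materials.

```python
from dataclasses import dataclass


@dataclass
class PairRecord:
    chosen_trial1: str  # word said on Trial 1 (by the subject or by the model)
    feedback: str       # "right" or "wrong", as called by the experimenter
    chosen_trial2: str  # word said by the subject on Trial 2


def score_trial2(records):
    """Count correct and matching Trial 2 responses, split by feedback type."""
    scores = {
        "right": {"correct": 0, "matched": 0},
        "wrong": {"correct": 0, "matched": 0},
    }
    for rec in records:
        matched = rec.chosen_trial2 == rec.chosen_trial1
        # Correct = repeating a "right" item or switching away from a "wrong" item.
        correct = matched if rec.feedback == "right" else not matched
        scores[rec.feedback]["matched"] += int(matched)
        scores[rec.feedback]["correct"] += int(correct)
    return scores


# Hypothetical word pairs, for illustration only.
records = [
    PairRecord("dog", "right", "dog"),
    PairRecord("sun", "right", "hat"),
    PairRecord("cup", "wrong", "cup"),
    PairRecord("bed", "wrong", "map"),
]
result = score_trial2(records)
```

Under this reading, a matched "wrong" item counts against the subject on the correctness measure but in favor of the subject on the matching (recall) measure, which is why the two measures can diverge for punished items.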


RESULTS 
Direct Versus Vicarious Experience 


For each group, the mean numbers of correct
responses for right and wrong items on
the performance trial under direct and vicarious
conditions are presented in Table 1.

Right items were repeated more often 
than wrong items were changed (F = 
7.09, df = 1/34, p < .01). Subjects experi- 
encing direct reinforcement did not differ 
significantly from subjects receiving the vi- 
carious experience. Moreover, type of expe- 
rience did not interact with type of item to 
a reliable degree. The sex variable was not 
significant nor did it interact with the other 
variables to a reliable degree. 


Recall of Vicariously Experienced Material 


The mean numbers of matched responses
for all groups under vicarious conditions on
Trials 2 and 3 are presented in Table 2.
More items were matched on Trial 3 than
on Trial 2 (F = 8.42, df = 1/17, p < .01),
more right responses were matched
than wrong responses (F = 5.07, df = 1/17,
p < .04), and type of item interacted
with trials (F = 13.17, df = 1/17, p < .002).
The interaction resulted from the
fact that many fewer wrong responses were
matched than right responses on Trial 2,
but slightly more wrong responses were
matched than right responses on Trial 3.
Again the sex variable did not produce a
reliable effect and did not interact with the
other variables to a reliable degree.


TABLE 1
MEANS AND STANDARD DEVIATIONS FOR REPETITION
OF CORRECT ITEMS FOR MALES AND FEMALES
UNDER DIRECT AND VICARIOUS
REINFORCEMENT CONDITIONS

                          Item
                    Right         Wrong
Reinforcement       M     SD      M     SD

Direct
  Males            6.0   1.09    5.6   1.11
  Females          6.3    .84    5.6    .86
Vicarious
  Males            6.1   1.05    4.4    .93
  Females          6.2    .97    4.5    199


TABLE 2
MEANS AND STANDARD DEVIATIONS FOR MALES
AND FEMALES UNDER VICARIOUS CONDITIONS
ON PERFORMANCE AND RECALL TRIALS

                          Item
                    Right         Wrong
Trial               M     SD      M     SD

Performance
  Males            6.1    .94    4.7    .99
  Females          6.2   1.01    4.6   1.03
Recall
  Males            6.7    .99    6.8   1.14
  Females          6.9    .89    6.9    .93


Discussion 


Analysis of the number of correct re- 
sponses on Trial 2 reveals two interesting 
effects. First, direct reinforcement did not 
differ significantly from vicarious reinforce- 
ment and second, right items were repeated 
more than wrong items were changed. How- 
ever, the effects of direct and vicarious ex- 
perience may not be so congruent as the 
main analysis suggests. It is apparent from 
viewing the means in Table 1 and a selected 
comparison by means of orthogonal weight- 
ing coefficients that vicarious punishment 
produced fewer correct responses on Trial 2 
than did direct punishment (F = 4.20, df 
= 1/34, p < .05). In fact, subjects under 
the vicarious condition tended to match the
model’s punished responses at a slightly
greater than chance level. Thus, subjects
under the vicarious conditions appear
not to have benefited from the model’s experience
on wrong items. That this presumption
is not completely correct is evident
when one compares the matching behavior
of subjects under vicarious conditions on
Trials 2 and 3. On Trial 3, these subjects matched
slightly more of the model’s wrong responses than his right
responses.

There are a number of conclusions that 
can be derived from these findings. First, 
the effects of vicarious and direct rewards 
are highly similar in terms of performance 
of the rewarded responses, while the effects 
of punishment are less clear with a sugges- 
tion that direct punishment leads to correct 
alternatives more than vicarious punish- 
ment. Second, the effects of vicarious re- 
wards and punishments are very nearly 
equal in terms of learning and retention of 
the model’s responses. 

The first of these findings may be ex- 
plained in a number of ways. Perhaps there 
exists already in subjects a propensity to 
emulate the model even in the absence of 
response consequences to the model. Such a 
tendency would tend to militate against a 
marked suppression of wrong items and ex- 
plain the difference between direct and vi- 
carious punishment. However, this explana- 
tion is rendered unlikely since the tendency 
to match the model would work for the 
benefit of reward as well as working for the 
detriment of punishment, and there is
no indication that vicarious reward is more
effective than direct reward.

An alternate explanation for the present 
finding is that there is differential forgetting 
of consequences. That is, although the data 
from Trial 3 indicate that right and wrong 
items are equally remembered, it is possible 
that wrong consequences were forgotten 
more often than right consequences. In such 
a case more wrong items would be repeated 
than right items resulting in fewer correct 
responses on Trial 2 for wrong items. Such 
an interpretation has a number of features 
in common with recent suggestions by 
Buchwald (1969) and Greenwald (1970). 
While this possibility explains the differ- 
ence between number of correct right and 
wrong items it does not explain why the 
difference is greater under vicarious than 
under direct conditions.

A third hypothesis that is not incompati- 
ble with the second explains the difference 
between direct and vicarious conditions: On
a given pair, a subject may or may not
remember which item was selected initially
(either by the model or himself). Given
that the response is recalled, the subject 
may know that the response is right; he 
may know that it is wrong; or he may be 
uncertain as to whether it is right or wrong. 
This reasoning is consonant with results re- 
ported by Nuttin (1947, 1949) indicating un- 
certainty on the part of a subject as to the 
locus of the assigned right and wrong. In 
the first case he will match the previous 
response, in the second he will switch to the 
alternate response. Given the evidence for 
the nearly equal learning of right and 
wrong items (recall data) one might expect 
that both reward and punishment should 
produce equivalent performances on Trial 
2. Since this was not the case, one might
infer that in the third case (of uncertainty),
vicarious subjects tended to match the model’s
behavior, increasing matching of both
right and wrong items. Such an interpreta- 
tion is concordant with the findings of Wal- 
ters and Amoroso (1967) that uncertainty 
increased the incidence of imitative behav- 
ior. Moreover, the difference between direct 
and vicarious subjects for wrong items suggests
that subjects tended to match the model’s
wrong items more than they repeated
their own errors.

The third hypothesis has interesting im- 
plications inasmuch as it predicts even 
greater matching of negative behaviors as 
time passes because of forgetting and hence 
increasing uncertainty regarding consequences.
Studies recently completed bear
out this prediction. This reasoning can ex- 
plain the relatively greater decrement in vi- 
cariously learned materials over time rela- 
tive to materials learned through direct ex- 
perience. 

The current study has numerous implications
for theories of instruction as well as
for the study of the vicarious transmission of
social behaviors (desirable and undesirable).
Particularly intriguing are the data
suggesting that a child may remember actions
or words that are labeled as wrong as
well as, or better than, those labeled as right,
especially when these data are considered in
conjunction with the finding that a child,
while he may remember what was done, may
forget whether it was right or wrong, and
presume that wrong deeds were right (if he
has seen them performed by someone else).
This is, of course, potentially the case in 
everyday classroom experiences of the child 
when he hears fellow students responding to 
questions and receiving feedback about their 
correctness. Depending upon the proficiency 
of his classmates he is being exposed to more 
or less incorrect information that he will 
shortly dissociate from the consequence 
that promoted its acquisition. He will, how- 
ever, have added the response to his reper- 
tory and he is prepared to emit that re- 
sponse under the appropriate conditions, 
perhaps especially because he recalls that 
“Johnny once said it that way.” 

Finally, it may be argued from Bandura’s 
findings (Bandura, 1965; Bandura et al., 
1963) that the effects of rewards and pun- 
ishments observed in the present study will 
not be obtained for many behaviors that 
are intrinsically interesting (such as violent 
or aggressive behaviors). While this may be
true, many actions which may be initially 
interesting are presented so frequently (e.g., 
television) as to lose the intrinsic ability to 
evoke attention and hence subtle variations 
of performance might be missed in the ab- 
sence of consequences to the models dis- 
playing the behavior. 

In conclusion, one cannot help but reflect 
on the paradox of exposing children to undesired
behavior that is vicariously (or directly)
punished. While punishment does
serve as an indication of the prohibited sta- 
tus of behavior it also serves as an event 
that promotes the learning of the forbidden 
response. Given this effective learning the 
child is then equipped, when sanctions be- 
come attenuated, to perform those actions 
with equal proficiency to those that have 
been fostered by positive means.


SOCIAL ENVIRONMENT AND INDIVIDUAL LEARNING:
A TEST OF THE BLOOM MODEL


HERBERT J. WALBERG¹
University of Illinois at Chicago Circle 


According to Bloom’s literature analysis, the correlation of measures 
on the same characteristic at two points in time will approach unity 
when the relevant, intervening environment is added to the prediction 
equation. This hypothesis was tested on random subsamples, with a 
minimum size of 715, of a large national sample of about 3,700 high 
school physics students. Simple, multiple, and canonical correlations 
showed that the environment scales predict cognitive and behavioral 
posttests. Multiple, multivariate, ordered, stepwise regression re- 
vealed that canonical variates derived from environment scales con- 
tribute a small but significant percentage of variance accounted for 
in the posttests after entering pretests and IQ in the equations. The 
range of multiple correlations for the complete regression model, corrected
for criterion unreliability, was .87 to .96.


The locus of interest in educational measurement
is beginning to shift from measures
of the individual to measures of the environment.
While individual measures have
been effectively used as predictors and criteria
and for selection and placement, environmental
assessments may make it possible
(a) to improve the accuracy of predicting
learning and (b) to manipulate the environment
to bring about optimal conditions
of learning (Walberg, 1969). One purpose
of the present study is to replicate the
general relationship between environment
and learning found in a previous series of
studies of high school physics classes (Walberg
& Anderson, 1968). A second purpose
is to probe Bloom’s (1964) model:


$I_2 = I_1 + f(E)$, where I represents quantitative
measures of a characteristic at two points in time
and E represents the relevant environmental characteristics
during the intervening period [p. vi].
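

As a rough numerical illustration of the model just quoted, the following Python sketch simulates a posttest built as a pretest plus a function of the environment. Everything in it is an assumption made for illustration: the data are simulated, f is taken to be linear with an arbitrary weight of 0.5, and no measurement error is included, so the second correlation equals 1 by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 715  # smallest subsample size reported in the study

# Hypothetical standardized pretest (I_1) and environment (E) measures.
pretest = rng.normal(size=n)
environment = rng.normal(size=n)


def f(e):
    """Assumed linear environmental contribution; the weight is arbitrary."""
    return 0.5 * e


# Posttest (I_2) generated exactly as the model states, with no error term.
posttest = pretest + f(environment)

# Correlation of the posttest with the pretest alone.
r_pretest_only = np.corrcoef(pretest, posttest)[0, 1]

# Correlation once the environmental term is added to the predictor.
r_with_environment = np.corrcoef(pretest + f(environment), posttest)[0, 1]
```

With real measures, unreliability in the tests and in the environment scales would keep the second correlation below unity, which is one reason corrections for criterion unreliability matter when the model is tested.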


Bloom formulated the following hypothesis: 


The correlation between measurements on the
same characteristics at two different times will approach
unity when the environment in which the
individuals have lived during the intervening period
is known and taken into consideration [p. 192].


¹ Requests for reprints should be sent to Herbert
J. Walberg, College of Education, University of
Illinois, Box 4348, Chicago, Illinois 60608.


Although the seminal assessment studies
of Pace and Stern (1958) on college environments
and Wolf (1964) on home environments
began important chains of prediction
research, subsequent work was not addressed
directly to testing Bloom’s hypotheses;
nor did it permit a test since
measures on the individual at two points in
time plus a measure of the relevant intervening
environment were not all collected.
In contrast, the evaluation design of Harvard
Project Physics (HPP) called for pre-
and posttesting on a number of cognitive,
affective, and behavioral criteria during the
first and last 2 weeks of the academic year 
as well as assessments of student percep- 
tions of classroom environments at mid- 
year (Walberg & Welch, 1967). The first 
series of studies on a national sample of 
high school classes showed the incremental 
validity (Cronbach & Gleser, 1957) of the 
environmental assessments; that is, with 
both individuals and classes as the units of 
analysis (Anderson & Walberg, 1968; Wal- 
berg & Anderson, 1968), environment added 
significantly to the prediction of a given 
posttest beyond that predicted by the corre- 
sponding pretest. However, it has since been 
argued in general that a more rigorous test 
of incremental validity would adjust each 
posttest for several relevant pretests before 
adding hypothesized measures to regression
models (Cronbach & Furby, 1970; Walberg,
1971). Thus, this analysis was used in a 
study of a national random sample of HPP 
classes (Walberg, 1969) and in the present 
study of individuals. 


METHOD


Instruments 


The rationale and development of the Learning 
Environment Inventory (LEI) have been de- 
scribed elsewhere (Walberg, 1969). Briefly, it con- 
sists of 14 dimensions, each having seven items de- 
scribing a class. On a 4-point scale, the student is 
asked to indicate his agreement or disagreement 
that the item describes his class. The internal con- 
sistencies (intraclass correlations for individuals) 
range from .58 to .86 and are shown in Table 1.

Four learning criteria were selected from a 
larger battery; the internal consistencies are given 
in Table 2 (all Kuder-Richardson Formula 20 except
for Activities, which is the intraclass correlation).
The first, the Test on Understanding Science,
is a 60-item, multiple-choice test on the nature,
processes, and goals of science (Cooley & Klopfer, 
1961). The Physics Achievement Test is a 36-item, 
multiple-choice test of general physics knowledge. 
Physical science interest was tapped with a subscale
of the Academic Interest Measure (Halpern,
1965). And voluntary participation in physics activities
during the past year was measured with a
subtotal of items having to do with physics on the
Pupil Activity Inventory (Cooley & Reed, 1961).


As an additional control variable, the Henmon- 
Nelson Test of Mental Ability was selected; its 
K-R 20 reliability is .91 (Lamke, Nelson, & Kelso,
1960). 


Sample and Analysis 


A simple random sample of 56 teachers was se- 
lected from the National Science Teachers Asso- 
ciation’s list of 17,000 physics teachers in the na- 
tion; another national sample of 19 teachers was 
included who had previously taught the course 
since the IQs of their students did not differ from 
the random sample (116 and 115, respectively). 
The 75 teachers (trained in testing the prior sum- 
mer) administered the instruments to about 3,700 
students in 144 classes. Using “random data collection”
(Walberg & Welch, 1967), about half the
students in each class took each test; and about a 
fourth took any combination of two tests. Be- 
cause of absences and some data lost in the mail, 
the smallest number of cases for any correlation 
was 715. 
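
The proportions quoted above follow from simple probability if each test is assigned independently to about half of a class. The Python sketch below checks this by simulation. The independence assumption, the choice of four test names, and the helper `share_taking` are illustrative assumptions and are not the documented procedure of Walberg and Welch (1967).

```python
import random

random.seed(0)

TESTS = ["Understanding", "Achievement", "Interest", "Activities"]
n_students = 3700  # approximate sample size reported above

# Assumption: each student is assigned each test independently with probability 1/2.
assignments = [
    {test for test in TESTS if random.random() < 0.5}
    for _ in range(n_students)
]


def share_taking(*tests):
    """Proportion of students whose assignment includes every named test."""
    wanted = set(tests)
    return sum(wanted <= taken for taken in assignments) / n_students


share_one = share_taking("Understanding")                 # expected near 1/2
share_pair = share_taking("Understanding", "Achievement")  # expected near 1/4
```

Because only about a quarter of the students contribute to any correlation between two tests, the number of cases per correlation is far smaller than the full sample, which is consistent with the minimum of 715 reported here once absences and lost data are taken into account.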

In the first analysis, the simple, multiple, and 
canonical correlations of the LEI scales and the 
posttests were computed. The second analysis em- 
ployed multiple, multivariate, ordered, stepwise 
regression analysis; the four posttests were simul- 
taneously regressed on the four corresponding pre- 
tests, IQ, and the two environmental canonical 
variates calculated during the first analysis. Both
the beta weights and the contributions to ex- 
plained variance by each predictor were tested for 
significance (using the .05 level) in a multivariate 
as well as a univariate sense. 
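
The idea of entering predictors in a fixed order and crediting each with the explained variance it adds can be sketched as follows. This is a univariate, simulated illustration only: the data, the weights, and the helper `r_squared` are assumptions, and the multivariate tests used in the study are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 715

# Hypothetical, simulated predictors (not the study's data).
pretest = rng.normal(size=n)
iq = 0.5 * pretest + rng.normal(size=n)           # correlated with the pretest
environment = 0.3 * pretest + rng.normal(size=n)  # correlated with the pretest
posttest = (0.7 * pretest + 0.2 * iq + 0.15 * environment
            + rng.normal(scale=0.5, size=n))


def r_squared(predictors, criterion):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(criterion))] + list(predictors))
    coef, *_ = np.linalg.lstsq(X, criterion, rcond=None)
    residuals = criterion - X @ coef
    return 1.0 - residuals.var() / criterion.var()


# Enter predictors in a fixed order and record each increment in R^2.
steps = [("pretest", pretest), ("IQ", iq), ("environment", environment)]
entered, increments, previous = [], {}, 0.0
for name, variable in steps:
    entered.append(variable)
    current = r_squared(entered, posttest)
    increments[name] = current - previous
    previous = current

full_model = r_squared([pretest, iq, environment], posttest)
```

Because the predictors are correlated, a variable entered last is credited only with the variance it explains beyond the earlier ones, so its increment is a conservative estimate of its contribution.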


TABLE 1

CORRELATIONS BETWEEN ENVIRONMENT SCALES AND POSTTEST CRITERIA FOR ABOUT 715 INDIVIDUALS
AND 144 CLASSES

                         Understanding          Achievement            Interest               Activities
Environment Scales       Individual  Class      Individual  Class      Individual  Class      Individual  Class

Intimacy (78)              01        -01          -05       -08          00         08          09*        05
Friction (78)             -12**      -19*         -10**     -14         -02        -30***      -02        -11
Cliqueness (74)           -06        -11          -03       -10         -07      [illegible]   -05        -19*
Satisfaction (80)          01         02           03        00          16***      33***       14***      17*
Speed (77)                 00         14          -05        14          01        -10          04        -07
Difficulty (66)            12**       43***        06        40***       03         13          01         04
Apathy (83)               -05        -11           01        02         -12*       -40***      -15***     -23**
Favoritism (77)           -18***     -16          -11**     -10          00        -17*         05         00
Formality (64)            -04        -09          -18***    -08          08*        12          05        -05
Direction (86)            -06        -15          -12**     -18*         05         17*         03      [illegible]
Democracy (67)             06         04          -02       -06          03         17*         01         01
Disorganization (81)      -07         04           02        03         -08*       -20*        -05         03
Diversity (58)            -01        -08          -04       -05         -06        -03         -02        -06
Environment (65)        [illegible]   18*          03        04          04         09          04      [illegible]
Multiple R                 29***      63***        30***  [illegible]    22         49***       25*        43

Note. Decimals omitted; read in hundredths. Internal consistencies (intraclass correlations) of the
environment scales are given in parentheses.
* p = .05.
** p = .01.
*** p = .001.


RESULTS


The simple and multiple correlations of 
the LEI scales with the posttests are shown 
in Table 1. Also, the same correlations with 
classes as the units of analysis from the 
previous study (Walberg, 1969) are shown. 
Although the significant correlations do not 
differ in sign, they do differ considerably in 
magnitude: generally the correlations for 
classes are higher; and neither is an accurate
estimate of the other. Except for Interest,
the multiple correlations are all significant.
Rather than examine the many significant
simple correlations one by one, the reader can
perhaps best see the covariate structure of the
two batteries in the canonical plots in Figure 1.
There were two significant canonical correlations
between the two batteries: .35 and
.25 (p < .001 and .05, respectively). The
loadings on the first environmental variate
indicate that high scores on the cognitive
tests and low scores on the affective tests
are associated with high ratings on Difficulty
and Apathy and low ratings on Favoritism,
Formality, Direction, Friction,
and Satisfaction; on the second variate, low
scores on Achievement, Interest, and Activities
are associated with low ratings on Satisfaction
and high ratings on Formality,
Direction, and Diversity. A more interpretable
solution results when the first variate
is rotated (Walberg, 1971) orthogonally
through Achievement. This solution suggests
that high scores on the cognitive tests
are associated with low ratings on Formality,
Favoritism, Direction, Friction, and
Speed; and on the second rotated variate,
high scores on the affective tests are associated
with high ratings on Satisfaction and
Favoritism and low ratings on Apathy and
Difficulty.

FIG. 1. Canonical loadings on two environment variates. (Code: o = criterion; x = predictor.)

For a direct test of Bloom’s hypotheses, 
the two unrotated environment variates 
were entered last in the regression models 
shown in Table 2. It should be noted that 
the regression analysis was calculated on 
the correlation matrix using 715 degrees of
freedom, the lowest number of cases for a
simple correlation. The last row of figures,
chi-square approximations to Wilks'
lambda, show that in a multivariate sense,
all seven predictors, including the two envi- 
ronment variates, contribute significantly to 
prediction of the posttest battery. In a uni- 
variate sense, the environment variates con- 
tribute to the prediction of three of the four 
posttests after the other five predictors 
enter the regression model. The amounts of 
explained variance contributed by IQ and 
the sum of the two variates, respectively, 
are: for Understanding, 2.88 and .98; 
Achievement, 3.24 and 2.14; Interest, .02 
and .04; and Activities, .24 and .81. It 
should be noted that in the case of
multicollinearity (correlated predictors), as in the
present data, tests of variables entering the
model last are more stringent; thus, the
small but significant incremental validity of
both IQ and environment is tentatively
established by the analysis.

Finally, it may be noted that the multi- 


TABLE 2 
MULTIVARIATE AND UNIVARIATE REGRESSION ANALYSIS 
Predictors: Pretests, IQ, and Environment 
Criteria: Posttests 7 R Ri 
aa | Achieve | Interest | Activities | IQ | Verate | Varlate 
Understanding (.76) 
r .64***|  .50***|  .16***| .90***| .54***| .94***| .02 .69*** | .87 
B .42**|  .11* .04 .08* .20*** ,08* .06 
RS 40.70***| 2.33***| — .43 -26 |2.88***| .60* .98 
Achievement (.77) 
r .52*** .69*** .2o*** .21***| .54***| .og***|— 19***| .75*** .88 
B .06 .48***! — .0o* .02 .29***| .11** |—.10** 
Ra? 27.46***| 22.47***| 1.06** .00 | 3.24*** 1.08** | 1.06** 
Interest (.91) 
" .l5*** 20%** .68*** .49***| .14***|.09* |—.13*** .70*** | .96 
B .06 .00 .59***|  .13** | .03 |—.06 |—.03 
Rè 2.10** | 2.66***| 42.10***| 1.84** | .02 .29 .12 
Activities (.80) 
T .09* .24***| agas) — 4*9 05 — |—.14**9|—.13***. .76*** | .90 
B | —.07 Hover 07: .67***|—.05  |—.07* |—.06 
RS 74 5.41***| 18.90***| 31.80***| .24 .46* .35 
Multivariate 
2 239*** 107*** i246*** 245*** 40s 16** 22*** 


Note.—The abbreviations in the table are as follows: r is the simple correlation of predictors and
criteria; B is the standardized partial regression weight; R² is the increment in variance accounted for
by the addition of each predictor to the univariate regression model; χ² is the approximation to Wilks'
lambda for the significance of additional variance in the criterion battery (in a multivariate sense) by
each predictor. The last two columns are multiple correlations uncorrected and corrected for
unreliability of the criterion (internal consistency reliabilities given in parentheses).
* p < .05.
** p < .01.
*** p < .001.
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ple correlations (corrected for unreliability 
of the criteria) in the last column of Table 
2 range between .87 and .96. These average
.90 and leave 19% of the variance
unexplained (1 − R²).
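
The arithmetic behind this summary can be checked with a short sketch. It assumes only the four corrected multiple correlations printed in the last column of Table 2 (.87, .88, .96, .90); the variable names are illustrative and not part of the original analysis.

```python
# Corrected multiple correlations from the last column of Table 2
# (Understanding, Achievement, Interest, Activities).
corrected_rs = [0.87, 0.88, 0.96, 0.90]

# Average the four correlations and round to two places, as in the text.
mean_r = round(sum(corrected_rs) / len(corrected_rs), 2)

# Proportion of criterion variance left unexplained: 1 - R squared.
unexplained = round(1 - mean_r ** 2, 2)

print(mean_r, unexplained)
```

Squaring the rounded average (.90) gives .81, so about 19% of the variance is left unexplained, which agrees with the figure reported above.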

Discussion 


Bloom’s hypothesis survives the fairly 
stringent present probe using classroom 
data. The multiple correlations of individ- 
ual measurements at two points in time do 
approach unity when the intervening envi- 
ronment is accounted for in the prediction 
equation. Shrinkage attributable to sam- 
pling error is not likely to be great with the 
large number of cases in the present study. 
However, attempts at replication in other 
subject areas and grade levels may result in 
an entirely different pattern of correlations 
between environmental assessments and 
learning criteria. 
RELATIONSHIP OF DISCRETE CLASSROOM BEHAVIORS
TO FOURTH-GRADE ACADEMIC ACHIEVEMENT¹


JOSEPH A. COBB²
Oregon Research Institute and University of Oregon 


The prediction of academic achievement from rates of specific task-
oriented and non-task-oriented behaviors was investigated. Students
were observed in two schools for 9 days during arithmetic periods.
For each school, multiple regression equations were generated using
rates of specific behaviors as independent variables and standardized
achievement scores as dependent variables. The final multiple Rs for
predicting arithmetic achievement were .69 for one school and .63 for
the other school. Cross-validation procedures resulted in correlations
of .58 and .50. Regression equations for predicting reading and
spelling achievement from the arithmetic observational data yielded
moderate multiple Rs of .66 and .50. On cross-validation, one
correlation was maintained. The implications of the findings in
helping young children achieve academically were discussed.


This report describes a new method of predicting academic achievement
which has been developed during the past few years. Rather than
concentrating on possible internal mediating variables measured and
inferred by paper-and-pencil tests or by teachers’ ratings, emphasis
has been placed on the child’s overt classroom behaviors as coded by
impartial observers. The technique is promising, since results provide
an empirical basis for theoretical formulations concerning academic
achievement correlates as well as suggesting possible intervention
strategies to increase achievement levels.

Meyers, Attwell, and Orpet (1968) reported results of a follow-up
study of 57 fifth-graders who had been tested in kindergarten. Both
behavioral ratings and scores from 13 ability tests were originally
obtained. Five years later the children's academic achievement was
measured by six California Test of Achievement (CTA) subtests. The
behavioral ratings and test scores were used as independent variables
in a stepwise multiple regression analysis for each CTA subtest. The
behavioral rating, “attention,” emerged as the first and most powerful
predictor for three criterion subtests and as fourth and fifth
predictor for two other subtests. The average correlation between all
six of the subtests and ratings on attention over the 5-year period
was .36. Similarly, Lahaderne (1968) obtained correlations of .39 to
.51 between observed rates of attention and various achievement
measures of 125 sixth-graders.

which led to the present study. Requests for re-
CORBEH, Department of Special Education, 1662
Columbia, University of Oregon, Eugene, Oregon
97403.

While these studies produced results 
highly suggestive of the importance of 
classroom behaviors to academic achieve- 
ment, several questions remain to be an- 
swered. The present investigation at- 
tempted to answer three major questions. 
First, are behaviors more specific than the 
general class of attention related to aca- 
demic achievement? By breaking down the 
work-oriented category to more discrete be- 
haviors, evidence might be available to 
teachers, counselors, and behavioral engi- 
neers of the specific behaviors which should 
be reinforced in the classroom to maximize
achievement.

A second question of theoretical and
practical significance relates to the
generalizability of behaviors across settings. Do
specific behaviors observed in one academic
setting relate to achievement in other
academic areas (e.g., if rates of some
behavioral class are obtained in arithmetic, do
these rates correlate with reading
achievement)? In the two previous studies, the
children were observed in all academic set- 
tings and gross scores were used to predict 
achievement in various areas. It was hy- 
pothesized that behavioral rates obtained in 
an arithmetic setting would produce higher 
correlations with arithmetic achievement 
than with reading and spelling achievement. 

A third question concerns replication. By 
applying cross-validation procedures, would 
results from one sample be replicated on 
another sample? Rarely in psychological 
studies of noncognitive correlates of aca- 
demic achievement have investigators used 
a replication design (Cobb, 1969). By 
building into the research design a check on 
the specificity of the results, the investiga- 
tor can provide strong evidence for the 
study’s generality. 

The present investigation concentrated on 
one particular subject area and one popula- 
tion. Fourth-graders were observed in five 
classrooms in two middle-class schools dur- 
ing arithmetic periods. Observational data 
were collected on all children during all 
phases of arithmetic, for example, teacher 
lecturing, desk work, etc. The observational 
data consisted of several categories of 
task-oriented and non-task-oriented behav- 
iors. One week after the study the children 
were tested on standardized arithmetic and 
reading tests. Using the standardized 
achievement tests as dependent variables, 
and rates of observed behavior as independ- 
ent variables, multiple regression equations 
were obtained; the beta weights obtained
from each school were then used to generate
predicted achievement scores in the other
school. Correlations were then run between
predicted and actual achievement to determine
the generalizability of the findings.


METHOD 

Subjects 

The subjects attended two elementary schools 
in a district composed of 1,596 fourth-graders. 
These two schools were selected as representative 
of the district on the basis of average IQ and 
achievement levels. Only subjects for whom full 
test data were available were included for data 
analyses. From an original sample of 120 fourth- 
graders, 46 males and 57 females were available. 
School A contributed 60 pupils, 34 females and 26 
males; School B contributed 43 pupils, 23 females 
and 20 males. 


Observations 


Observer reliability. Seven professionally trained 
observers coded the children’s behaviors. They 
were trained in four 1-hour sessions using a tele- 
vision tape of children working in an academic 
setting. During training, observer reliability was 
calculated by the percentage method. The ob- 
server had to agree by code category, as well as 
subject and sequence, with a master sheet which 
had been previously coded by the trainer. The 
number of agreements was then divided by the 
total possible number of agreements to arrive at 
the reliability for each observer. In the fourth and 
final television session the average reliability was 
85%. 
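
The percentage method described above can be sketched as follows. This is a minimal illustration, not the trainer's actual scoring procedure: the (subject, code) entries, their order, and the function name `percent_agreement` are all hypothetical, chosen so that 17 of 20 entries agree.

```python
def percent_agreement(observer, master):
    """Agreements divided by total possible agreements.

    Each sheet is a sequence of (subject, code) entries; two entries agree
    only if subject and code match at the same position in the sequence.
    """
    if len(observer) != len(master):
        raise ValueError("sheets must contain the same number of entries")
    agreements = sum(o == m for o, m in zip(observer, master))
    return agreements / len(master)


# Hypothetical master sheet: 20 coded intervals cycling over 5 subjects.
codes = ["AT"] * 12 + ["TPP", "VO", "CO", "SS", "OC", "LO", "NA", "AT"]
master = [(i % 5 + 1, code) for i, code in enumerate(codes)]

# Hypothetical observer sheet that disagrees on three intervals.
observer = list(master)
observer[3] = (master[3][0], "LO")
observer[9] = (master[9][0], "NA")
observer[15] = (master[15][0], "AT")

print(percent_agreement(observer, master))
```

With these made-up sheets the function returns .85, the same order of magnitude as the average reliability reported for the final training session.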

Actual classroom observations occurred during
arithmetic classes in three School A classrooms
and two School B classrooms. Each child was
observed for a 10-second interval. Then the observer
watched the next child for 10 seconds until every
child in the classroom had been coded; then the
sequence began again. Observers coded behavior
throughout the class period, and collected 9
consecutive days of data. The observers did not
know the children’s names, since the children were
assigned numbers according to a seating plan made
by the teacher before the observational phase of
the study began. Additionally, the observers were
unaware of ability or achievement levels of
individual children.

Two observers were assigned to each classroom;
one observer coded on one day; the other coded
on the next day, except when reliability data were
obtained, at which time both observers coded
behaviors simultaneously, but independently. The
interobserver reliability was [illegible] by
the percentage method and .93 by the Pearson
product-moment correlation (see Table 1).

Stability of data estimation. For an observa- 
tional procedure to be useful as an assessment 
procedure, it is necessary to determine the mini- 
mum amount of data to be collected to provide 
a stable estimate of a person's behavior over time 
(Moreno, 1967; Nixon, 1966; Werry & Quay, 
1968). The general procedure has been to use a 
method similar to that used by test constructors in 
item analysis. Data collected on one day are
compared to data collected on another day, which is
TABLE 1
MEAN INTEROBSERVER RELIABILITY COEFFICIENTS AND STABILITY COEFFICIENTS CORRECTED BY
SPEARMAN-BROWN PROPHECY FORMULA FOR ENTIRE SAMPLE BY
SPECIFIC BEHAVIORAL CATEGORIES


Behavioral categories    | AT  | TTP | TPP | VO  | IT  | CO  | SS  | OC  | PL   | TTN | TPN | NC   | LO  | NA  | X̄
Reliability coefficients | .83 | .75 | .93 | .93 | .99 | .99 | .88 | .97 | 1.00 | a   | .97 | 1.00 | .87 | .95 | .93
Stability coefficients   | .83 | .39 | .73 | .60 | .43 | .56 | .68 | .77 | a    | .45 | .28 | -.04 | .52 | .69 | .56


a Insufficient amount of entry to calculate.


analogous to odd-even reliability. For this
analysis a total of 5 minutes of frequency data collected
on Days 2, 4, 6, and 8 was compared to 6 minutes
and 10 seconds of data collected on Days 1, 3, 5,
7, and 9 (see Table 1). The average correlation
across codes was .56.
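
The title of Table 1 states that the stability coefficients were corrected by the Spearman-Brown prophecy formula. A minimal sketch of that formula follows; the half-sample correlation of .50 is a hypothetical input, since the uncorrected values are not reported, and the standard formula assumes halves of equal length, which holds only approximately for the even-day and odd-day samples here.

```python
def spearman_brown(r_half, factor=2.0):
    """Spearman-Brown prophecy formula.

    r_half is the correlation between two part-samples; factor is how many
    times longer the full sample is than one part (2 for split halves).
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


# Hypothetical even-day vs. odd-day correlation for one behavior category.
r_odd_even = 0.50
print(round(spearman_brown(r_odd_even), 2))
```

For a split-half correlation of .50 the corrected estimate is 2(.50)/(1 + .50), or about .67.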

The difficulty with using the Pearson product- 
moment correlation to compute the stability co- 
efficients was the number of zero entries in sev- 
eral behavioral categories. When very few indi- 
viduals exhibited a certain behavior out of the 
entire sample for 9 days, a small change in rank- 
ing had a spurious effect on the obtained correla- 
tion. To test out the possibility that low base- 
rate events were resulting in a distortion of the 
data, a rank-order correlation was obtained
between the average rate of behavior and the
stability coefficient for each behavior category.
The obtained coefficient was .64, significant at the
.05 level. The six behavioral categories that
consistently contributed the greater percentage of
variance in a regression analysis had a mean sta- 
bility coefficient of .71. These results suggest that 
longer samples of low base-rate behavior need to 
be collected in order to obtain stability coeffi- 
cients of higher magnitude than those obtained in 
11 minutes of observation. 

After the reliability data were analyzed using 
frequencies of occurrence of each behavior, rates 
were obtained for each subject by totalling each 
frequency count across days and dividing by the 
number of minutes of observation. The mean num- 
ber of minutes of data for each subject was 14, 
with a range of from 12 to 24. The reason for the 
wide range was the unequal distribution of stu- 
dents in the classes, for example, only 11 students 
were present in the class where 24 minutes of 
data were collected. 

Coding system. The codes used in the analysis
of the data were the following:


AT—Attention. Pupil is doing what is
appropriate in an academic situation, e.g., he is
looking at the teacher when she is presenting
material; he writes answers to arithmetic problems;
during recitation he looks at other students who
are reciting. Category is only used when other
work-oriented categories are not applicable.
TTP—Talk-to-teacher-positive. The pupil talks
to the teacher about academic material.
TPP—Talk-to-peer-positive. The pupil talks to
another student about academic material.
VO—Volunteers. Pupil indicates he wants to
make an academic contribution, e.g., the teacher
asks a question and the pupil raises his hand.
IT—Initiation-to-teacher. Pupil indicates he
wants some assistance in academic work, e.g., he
goes to teacher’s desk during independent study
and asks for help on an arithmetic problem.
CO—Compliance. Pupil does what teacher
requests, e.g., she asks class to take out
notebooks and pupil does; she asks for papers to be
turned in and pupil obeys.
SS—Self-stimulation. Pupil stimulates himself
in such ways as scratching himself, rubbing a
pencil back and forth on his desk, feeling the
material in his clothing, to such an extent that
he is not paying attention to the assignment.
OC—Out-of-chair. Pupil is out of his chair and
not engaging in academic activities, e.g., he
goes to teacher for assistance but she is busy
with another pupil; he walks around the room.
PL—Play. Pupil is playing with another pupil,
e.g., playing tic-tac-toe while the teacher is
presenting material to the class.
TTN—Inappropriate-talk-to-teacher. Pupil talks
about nonacademic material.
TPN—Inappropriate-talk-to-peer. Pupil talks
about nonacademic material.
NC—Noncompliance. Pupil does not do what
is requested by the teacher.
LO—Looking around. Pupil is looking around
the room, out the window, or staring into space.
NA—Not attending. Pupil is not attending to
the assignment and no other category is
appropriate.


Achievement Tests 


The Stanford Achievement Test (SAT) was
administered. For each student two means were
computed; the first consisted of the SAT
subtests Word Meaning, Paragraph Meaning, and
Spelling; and the second, Arithmetic Concepts,
Arithmetic Applications, and Arithmetic
Computation.


Statistical Analysis 


Scores obtained on the eight observable
behaviors that had stability coefficients above .50
and scores on the two achievement means were
subjected to analysis by means of the BMD02R
Stepwise Regression Program at the Health
Science Computing Facility, UCLA. The program is
designed to select one independent variable at 
a time, the one that provides the greatest con- 
tribution in accounting for the variance of the 
dependent variables. Cross-validation procedures 
were accomplished by applying the regression 
equations obtained for each school in the step- 
wise regression programs to the other school. The 
first three independent variables were included 
for cross validation. The resulting predicted 
achievement scores were then correlated with 
actual achievement scores. 
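
The selection rule stated at the start of this section (stability coefficient above .50) can be illustrated with the values printed in Table 1. This is a sketch for checking the count of eight categories only; the dictionary is transcribed from the table as it reads in this copy, with `None` marking the category that had insufficient entries.

```python
# Stability coefficients by behavioral category, transcribed from Table 1.
stability = {
    "AT": 0.83, "TTP": 0.39, "TPP": 0.73, "VO": 0.60, "IT": 0.43,
    "CO": 0.56, "SS": 0.68, "OC": 0.77, "PL": None, "TTN": 0.45,
    "TPN": 0.28, "NC": -0.04, "LO": 0.52, "NA": 0.69,
}

# Keep only categories whose stability coefficient exceeds .50.
retained = [code for code, coef in stability.items()
            if coef is not None and coef > 0.50]

print(retained)
```

The rule retains AT, TPP, VO, CO, SS, OC, LO, and NA, which are the eight categories that appear in the correlation matrices of Table 3.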


RESULTS 


To determine the strength of the relation- 
ship of specific observable classroom behav- 
iors obtained in arithmetic classes to aca- 
demic achievement, eight behavioral cate- 
gories were used as possible independent 
variables in a regression analysis in which 
means of various achievement tests were 
the dependent variables. 

It was hypothesized that a combination 
of specific task-oriented and non-task-ori- 
ented behaviors observed during arithmetic 
periods would be predictive of arithmetic 
achievement. Table 2 presents listings of 
the first five behavioral categories in order 
of their entry into separate regression equa- 
tions generated by data from each school. 
The listings provide the new multiple R 
which was obtained with the addition of 
another behavioral category. The final mul- 
tiple R was that obtained by using the en- 
tire eight behavioral categories. The first 
two rows in Table 2 list the observable
behaviors for predicting arithmetic
achievement in each school. The final multiple R
was .69 for School A and .63 for School B;
these Rs provide strong evidence that a
combination of specific behaviors is highly
predictive of arithmetic achievement.

The behaviors that contributed to the 
high multiple Rs were quite consistent 
across schools. The most powerful
behavioral category in each school was
“attending,” which provided an R of .40 for School
A and .47 for School B. Three of the next 
four behavioral categories, while not in the 
same order of entry into the regression 
equation for each school, were similar; they 
were “talk-to-peer-positive,” “compliance,” 
and “self-stimulation.” From the data anal- 
ysis it seemed that the same kinds of be- 
haviors were important to achievement in 
both schools. Although the validity coeffi- 
cients (see Table 3) for compliance and 
talk-to-peer-positive seem discrepant be- 
tween schools, a two-tailed test (Hayes, 
1963, p. 532) indicated no significant differ- 
ences. 
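
A two-tailed test of the difference between two independent correlations is commonly carried out with Fisher's r-to-z transformation; the sketch below assumes that approach, since the source gives only a page citation for the test. The two coefficients (.37 and -.01 for talk-to-peer-positive with arithmetic) are the values as they appear to read in Table 3 and are used for illustration only.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed z test for the difference between independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))             # two-tailed normal p value
    return z, p


# Talk-to-peer-positive with arithmetic: School A (n = 60) vs. School B (n = 43).
z, p = fisher_z_difference(0.37, 60, -0.01, 43)
print(round(z, 2), round(p, 3))
```

Under these assumptions the statistic falls just short of the two-tailed .05 critical value of 1.96, in line with the report of no significant difference.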

The crucial test that the observable be- 
haviors found to be predictive of arithmetic 
achievement in one school were also predic- 
tive in another school was accomplished by 
applying the beta weights for the three 
most powerful predictors to data from the 
other school. Correlations were then run on 
the obtained scores with the actual arithme- 
tic achievement scores (Table 4). This 
method of cross validation produced
correlations of .58 for School A and .50 for
School B.
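
The cross-validation logic described above (fit weights in one school, apply them to the other, correlate predicted with actual scores) can be sketched in plain Python. Everything in the example is an assumption made for illustration: the data are synthetic, the generating model and helper names are invented, and raw least-squares weights with an intercept stand in for the standardized beta weights used in the study.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(xs, y):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, xs):
    """Apply fitted weights to new predictor rows."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in xs]


def pearson(u, v):
    """Pearson product-moment correlation."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) *
                           sum((b - mv) ** 2 for b in v))


def make_school(n, rng):
    """Synthetic pupils: three behavior rates and one achievement score."""
    xs = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [2.0 * a + 1.0 * b - 1.0 * c + rng.gauss(0, 0.5) for a, b, c in xs]
    return xs, y


rng = random.Random(0)
xa, ya = make_school(60, rng)   # stands in for School A
xb, yb = make_school(43, rng)   # stands in for School B

w_a = fit(xa, ya)                           # weights estimated in School A
r_cross = pearson(predict(w_a, xb), yb)     # cross-validated r in School B
print(round(r_cross, 2))
```

Because a correlation is unchanged by rescaling the predicted scores, the choice between raw and standardized weights matters less here than the fact that the weights come from a sample other than the one in which they are evaluated.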

The second hypothesis, that behaviors 
observed in arithmetic would be predictive 
of success in reading and spelling, received 
support within schools, but the findings 


TABLE 2
FIRST FIVE PREDICTORS FOR PREDICTED ARITHMETIC VARIABLES USING
BEHAVIORAL CATEGORIES, rs, AND Rs


Dependent variable            | Best            | Second         | Third    | Fourth  | Fifth          | Final R
All arithmetic subtests
  School A                    | AT .404         | TPP .504       | SS .631  | OC .652 | CO [illegible] | .694
  School B                    | AT .479         | CO .564        | TPP .605 | VO .619 | SS .625        | .626
Reading and spelling subtests
  School A                    | TPP [illegible] | OC [illegible] | SS .564  | VO .621 | CO .641        | [illegible]
  School B                    | OC .250         | TPP .339       | AT .405  | CO .454 | NA .477        | .496
TABLE 3
CORRELATION MATRICES FOR SCHOOL A (n = 60) AND SCHOOL B (n = 43) DEPICTING INTERCORRELATION
AMONG BEHAVIORAL CATEGORIES AND VALIDITY VECTORS FOR ACHIEVEMENT TESTS


SCHOOL A 
| , _ | Reading 
AT TPP vo co ss oc Lo NA Arithmetic | and 
spel 
S | AT —.00 —.42  —.3  —.05 22 —.29 | —.50 .40 .25 
C. | TPP | —.46 —.82- —.15. —.l4 16, —.16. —.11 37 E 
H |VO | —.13 .08 .58  —.16 —.44  —.02 .94  —.98 -.42 
Oo |CO .06 .08 .32 —.32 .—.32 —.27 .51 .04 —.06 
O |SS —.62 18 —.08 —:10 —.09 .92 —.26 —.85  —.29 
L |Ooc —.72 19 04 —.15 .62 =.17 —.24 .38 37 
LO | —.03 —.28 .0  —.18 —.0  —.14 02  —.30 -.2 
B |NA | —.69 -26 01 — —.40 AT .38 .06 -—. -.H 
Arithmetic 48 —.01 14 .938 | —.30  —.388  Á —.15 — 48 
Reading and 
spelling 24 .18 .16 .24 —.22  —.25 —.14 —.17 


were inconsistent on cross validation. In the 
last two rows of Table 2 are listed the first 
five behavioral categories that contributed 
to the prediction of reading and spelling 
achievement for each school. The final mul- 
tiple R was .66 for School A and .50 for 
School B. These data provided partial con- 
firmation that behaviors observed in one 
academic setting are predictive of success in 
a different academic achievement area. 

In order to verify the relationship of be- 
havior and achievement in an unobserved 
academic area, the same cross-validation 
procedures that were used in arithmetic 
were applied to the reading and spelling 
data. Table 4 lists the three predictors that
were used and the results. The
cross-validation procedures produced nonsignificant
results in School B and a correlation of .47 in
School A.

Post hoc analysis indicated that the main 
reason for lack of cross validation from 
School A to School B appeared to be the 
significantly different intercorrelations and 
validity coefficients for out-of-chair in each 
school (see Table 3). In School A, out-of- 
chair had a correlation of .37 with reading 
and spelling achievement, whereas in School 
B the correlation was -.25. The
intercorrelation of out-of-chair with attention was .22
for School A and -.72 for School B; similar
discrepancies were noted in its relations 
with talk-to-peer-positive and self-stimula- 
tion. It entered both regression equations 
but with a different sign and beta weight, 
the latter being considerably larger in 


TABLE 4
CROSS-VALIDATION RESULTS, LIST OF VARIABLES, ORIGINAL MULTIPLE Rs, AND CROSS-VALIDATED rs

Dependent variable            | Variables    | Original multiple R | Cross-validated on  | r
All arithmetic subtests       | AT, TPP, SS  | .63                 | School B (n = 43)   | .50
                              | AT, TPP, CO  | .61                 | School A (n = 60)   | .58
Reading and spelling subtests | TPP, OC, SS  | .56                 | School B (n = 43)   | -.10
                              | AT, TPP, OC  | [illegible]         | School A (n = 60)   | .47
School A than in School B. The effect was
to reduce the contribution of both talk-to- 
peer-positive and self-stimulation during 
cross validation. On the other hand, the 
combined beta weights of attention and 
talk-to-peer-positive more than compen- 
sated for the effects of out-of-chair in 
School A. The first step in the regression 
equation for School A resulted in the re- 
moval of talk-to-peer-positive due to its 
moderate correlation of .42 with reading 
and spelling achievement leaving a partial 
correlation of .31 for attention. Thus, when 
out-of-chair was used with a small beta 
weight in cross validation for School B to 
School A the effect was minimal. These re- 
sults suggest that prediction of achievement 
in a different academic area than that in 
which the behavior is observed needs fur- 
ther investigation. 


Discussion 


These findings suggest that specifying
more discrete behaviors of the general
response class of work-oriented behaviors
provides stronger relationships to
achievement than those obtained in previous
studies. Thus, the child who talks about
academic material to his peers, as well as
attending to his work, is more likely to succeed
than the child who attends without
interacting with his peers. The finding that the
behavior talk-to-peer-positive consistently 
became a powerful predictor within samples 
for reading and spelling and across samples 
for arithmetic suggests that the successful 
child receives more practice in academic 
skills through his social interaction than do 
peers whose social interactions are less
concerned with academics.

Since the variable compliance held up 
well in the cross validation for arithmetic, 
it seems reasonable to assume that children 
who follow instructions given by the 
teacher are more likely to be achievers. 
This finding supports the work of many
investigators who have reported that deviant
children achieve at lower levels than their
peers (Glueck & Glueck, 1964; Robins,
1966).

The implications of these findings for engineering purposes suggest
the possibility of altering key classroom behaviors using techniques
of social learning developed by Patterson, Ray, and Shaw (1968), and
Cobb, Ray, and Patterson (1970). Since cause and effect cannot be
ascertained from the correlational approach used in the present
study, empirical studies in which key behavioral rates are altered
need to be conducted to determine if concomitant changes occur in
achievement level. Other investigators have attempted and succeeded
in changing various personality variables that have been identified
as correlates of academic achievement, but no significant change was
noted in elementary school achievement (Berson, 1966; Fisher, 1962;
Hall, 1963; Miller, 1968; Munger, Winkler, Teigland, & Kranzler,
1964; Southworth, 1966; Winn, 1962). Their findings indicate that the
altered variables may be the product of achievement, rather than the
cause. A counselor may change a child's self-concept, but has done
nothing about the child's classroom behavior. It is hypothesized that
if a child were taught to attend and talk to his peers about academic
material at higher levels than base line, he would have a better
chance of significantly increasing his achievement level than if his
self-concept alone is altered.
LEARNING-CRITERION ERROR PERSEVERATION IN
TEXT MATERIALS¹
Eighty-eight undergraduates, divided into high and low verbal ability
groups, learned a 24-frame program under conditions of either overt
or covert responding to multiple-choice frame questions. Number of
incorrect question alternatives was varied as a within-subjects
factor. Posttest scores showed facilitation for overt responding and
ability but no reliable differences for number of errors available
during learning. Mean time per frame was greatest in the case where
no correct choice was available. These results, in conjunction with
prior research, seem to indicate that the perseveration of incorrect
choices from learning to posttest is not a direct function of the
transfer of learning errors per se, but rather of the design of the
instruction itself. It is suggested that a requirement to respond to
the materials facilitates under conditions of adequate instructional
design. The number and format of incorrect alternatives are not an
influential factor in error perseveration. Based on these results it
was proposed that greater attention be paid to the criterion validity
of lessons and less attention to varying formats and presentation
styles.


It has long been a canon of sound instruction that errors made during
learning are detrimental to later test performance (e.g., Skinner,
1954, 1968). While the present authors find no disagreement with this
conclusion, there appear to be serious difficulties in specifying
which instructional characteristics are responsible for the transfer
of learning errors to a criterion measure. The argument has been that
error perseveration is some function of variables such as
“meaningfulness” of the material, availability of incorrect
alternatives during learning, and the mode of response learners are
required to make. Studies attempting to explore these various
possibilities often suffer from methodological flaws, and although
their results are frequently used as a base for instructional design,
there appears to have been no attempt to test error transfer effects
with expository text materials.

¹ The experimental materials are adaptations of
earlier programs developed by O'Day and Kulhavy
for use in their prior research studies and those of
other collaborators. These studies are reported in
Programmed instruction: Techniques and trends.
New York: Appleton-Century-Crofts, 1971. We
are grateful to the publisher for releasing this
copyrighted material.

² Requests for reprints should be sent to
Raymond W. Kulhavy, who is now at the Department
of Educational Psychology, Arizona State
University, Tempe, Arizona 85281.

The question of error perseveration as it 
relates to both the amount of incorrect ma- 
terial present during learning, and the mode 
of response, has been approached in two 
studies by Kaess and Zeaman (1960). The 
first experiment manipulated initial error 
availability by varying the number of in- 
correct alternatives presented to subjects 
answering 30 multiple-choice test items. 
Learners responded to the items by
punching their choices in a Pressey-style
punchboard. As predicted, initial errors tended to
perseverate to later trials. However, this
error transfer effect failed to occur in the
second experiment where subjects were not
required to overtly respond on the initial
trial. Apparently, errors span the
learning-posttest interval only when the learner
must produce a physical discrimination
among the alternatives. These results seem
to indicate that the requirement to respond
overtly helps somehow to “fix” the incorrect
choice in the learner’s repertoire, perhaps
by forcing closer inspection of the material
and, consequently, “better” learning of the
errors.

A similar pair of studies by Elley (1966) 
varied both the number of initial error al- 
ternatives and the judged meaningfulness of 
the multiple-choice test items. All of the 
subjects in these experiments responded 
overtly by entering their answers into a 
punchboard. Results for the low meaningful 
items were essentially the same as for the 
Kaess and Zeaman (1960) overt group. 
However, for the items rated as high mean- 
ingful, the number of initial wrong alterna- 
tives available did not have a significant 
influence on test-trial performance. Elley 
seems to indicate that merely using more 
meaningful items is enough to destroy per- 
severation effects. However, since the 
Kaess and Zeaman questions dealt with 
basic psychological terms (we leave the 
reader to judge meaningfulness in this 
case), it is hard to see how the series and 
analogies problems used by Elley could be 
rated as any more understandable or famil- 
iar to the learners. 

In addition to these confusing results cen- 
tering on characteristics of the materials, 
both sets of studies appear subject to poten- 
tial confounding simply because those sub- 
jects having more wrong alternatives ini- 
tially available consequently had greater 
amounts of interfering material with which 
to contend. Under these conditions the more 
errors a learner is faced with the more he is 
likely to make, especially with unfamiliar 
material. 

The net result of these studies suggests 
that learning-posttest error perseveration is 
a function of at least three factors. First, 
the rated meaningfulness of the items used; 
second, the amount of incorrect material 
available during learning; and third, the re- 
sponse mode required of the learner. 

Considering only meaningfulness, it is 
unclear what prediction one should make 
concerning perseveration effects with an in- 
structional text. In this case, where the rel- 
ative meaningfulness of the material is es- 
tablished (at least in the present research), 
we would tentatively hypothesize that per- 
severation would occur under specified con- 
ditions, since these effects were found with 
the Kaess and Zeaman materials which 
most nearly meet the accepted form and 
content of instruction. Hence, one purpose 
of the present experiment is to determine 
under what conditions, if any, a direct link 
is forged between learning and posttest er- 
rors using meaningful programmed materi- 
als. 

Regarding the second factor, error avail- 
ability during learning, it seems that if a 
direct relation exists between learning and 
posttest errors one could predict that the 
two extreme learning conditions, all errors 
and no errors, would lead to widely differ- 
ent posttest results. When the initial 
amount of incorrect material is equated 
across error conditions and learners, we 
would expect far higher posttest scores 
when a subject is unable to make an incor- 
rect learning response than under conditions 
where he is forced to make a wrong choice. 
The second objective of this study is to test 
the viability of this hypothesis. 

A third experimental test concerns the ef- 
fect on perseveration of whether or not an 
overt response is required. Since making an 
overt response appears to increase attending 
behavior (see Anderson, 1967) we would 
predict that learning responses are more 
likely to transfer intact to the posttest be- 
cause subjects pay closer attention to the 
choice made. A response mode by error- 
availability interaction should result, in 
which the overt group shows much greater 
differentiation than the covert subjects as a 
function of number of wrong alternatives 
available during learning. 


Method 


General Design 


Eighty-eight undergraduate volunteers from 
the University of Illinois were stratified into high 
and low verbal ability (VA) groups on the basis 
of a classroom pretest. Twelve subjects were 
chosen at random and assigned to a control group 
which received the posttest but no instruction. 
The remaining 38 subjects in each of the high and 
low VA conditions were assigned, 19 each to the 
overt and covert response mode (RM) condi- 
tions. The design to this point was a 2(VA) × 
2(RM) factorial design with 19 subjects per cell. 
The 24 frames of programmed materials received 
by each subject contained four types of ques- 
tions. Each question type differed in the number 
of error alternatives (EA) available to the learner 
after he had read the frame of text. For a given 
question, a subject could have all of the multiple- 
choice alternatives correct (EA-0), one error and 
two correct alternatives (EA-1), two errors and 
one correct alternative (EA-2), and finally, three 
errors and no correct alternatives (EA-3). Each 
subject received six frame questions of each type 
in his program. Since the posttest consisted of the 
same 24 questions as were contained in the text, 
this time with only one correct alternative, it was 
possible to obtain four posttest scores for each 
subject, one for each of the EA combinations pre- 
sented during learning. The resulting design was 
a 2(VA) × 2(RM) × 4(EA) analysis of variance 
with repeated measures on the EA factor. 
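
The degrees of freedom attached to the F ratios in the Results follow from this design. A minimal sketch of the bookkeeping, assuming the standard partition for a mixed design with two between-subjects factors and one repeated factor (the variable names are illustrative and are not taken from the original report):

```python
# Degrees of freedom for a 2(VA) x 2(RM) x 4(EA) mixed design with
# repeated measures on EA. All names here are illustrative only.
total_subjects = 88
control_subjects = 12
va_levels, rm_levels, ea_levels = 2, 2, 4

treated = total_subjects - control_subjects      # instructed subjects
cells = va_levels * rm_levels                    # between-subjects cells
per_cell = treated // cells                      # subjects per cell

df_va = va_levels - 1
df_rm = rm_levels - 1
df_between_error = treated - cells               # subjects within cells
df_ea = ea_levels - 1
df_within_error = df_ea * df_between_error       # EA x subjects within cells

print(per_cell, df_between_error, df_ea, df_within_error)
```

Under these assumptions the sketch yields 19 subjects per cell and error terms of 72 and 216 degrees of freedom, which agrees with the df = 1/72 and df = 3/216 reported with the F ratios.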


Material 


A 30-frame teaching program (O'Day, Kulhavy, 
Anderson, & Malcynzski, 1971) on the structure 
and function of the human eye was reduced to 24 
frames averaging 86 words in length. The original 
program was reduced by randomly selecting six 
frames to be combined with the text of the frame 
on either side of them. The frame chosen for the 
combination was also randomly determined. Since 
the program was originally designed to read se- 
quentially, there was no break in subject matter 
continuity caused by the reduction process. Each 
of the 24 frames included its corresponding three- 
alternative, multiple-choice test item. For every 
frame question, two additional correct and three 
incorrect alternatives were generated. Independent 
judges rated the new alternatives for their simi- 
larity to one another and to the original items. 
From the pool of alternatives now available, four 
different choice patterns were possible for each 
question, that is, no errors-all correct (EA-0), one 
error-two correct (EA-1), two errors-one correct 
(EA-2), and three errors-no correct (EA-3). To 
control for bias due to the varying length and 
content of frames, the assignment of EA choice 
patterns to frames was randomized within the 
program booklets. The sole restriction on this 
randomization required that six items from each 
of the four EA levels appear in each subject's 
program. To equate presentation across VA and 
RM levels, the within-book randomization was 
carried out on sets of four booklets, one for each 
factorial cell. 
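
The randomization restriction described above can be sketched as follows. This is only an illustration, assuming that "sets of four booklets" means one shuffled assignment shared by the four VA by RM cells; the function names and the seed are hypothetical.

```python
import random
from collections import Counter

EA_LEVELS = (0, 1, 2, 3)                  # error alternatives per question
N_FRAMES = 24
PER_LEVEL = N_FRAMES // len(EA_LEVELS)    # six frames at each EA level


def assign_ea_patterns(rng):
    """Return 24 EA levels in random frame order, six of each level."""
    patterns = [level for level in EA_LEVELS for _ in range(PER_LEVEL)]
    rng.shuffle(patterns)
    return patterns


def booklet_set(rng):
    """One randomization shared by the four booklets of a VA x RM set."""
    patterns = assign_ea_patterns(rng)
    cells = [(va, rm) for va in ("high", "low") for rm in ("overt", "covert")]
    return {cell: list(patterns) for cell in cells}


rng = random.Random(0)    # seed used only to make the sketch repeatable
booklets = booklet_set(rng)
print(Counter(booklets[("high", "overt")]))
```

Sharing one shuffled order within a set keeps frame content and EA level matched across the four factorial cells, so differences between cells cannot be attributed to which frames happened to carry which choice pattern.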

In addition to the program a second booklet 
containing six text-related drawings with instruc- 
tions for their use was included with each presen- 
tation. 

The posttest items were identical to those given 
during the program, with the exception that they 
contained only the alternatives from the original 
program. 


Procedure 


Prior to the experiment, the subject population 
was given a 48-item verbal ability measure 
(French, Ekstrom, & Price, 1963). The resulting 
score distribution was median split into high and 
low VA diads—the same number of subjects in 
each half. Each subject was randomly assigned 
to an RM level according to his VA standing. 
Subjects participated in groups of from 5 to 16 
with individuals from each treatment in every 
session. Taped instructions directed subjects to 
work at their own rate of speed, choosing what 
they considered to be the one best answer to each 
question. Although cautioned that the discrimina- 
tions might be difficult, subjects were not in- 
formed of the differences in choice patterns, nor 
that they would receive a test following the pro- 
gram. Instructions within the booklets required 
the overt response group to underline their re- 
sponse choices. The covert subjects were told to 
respond silently, making no mark in the program 
booklet. Each participant recorded the completion 
time for each frame to the nearest 15 seconds from 
a visible timeboard. The posttest was administered 
in a separate room immediately following the 
program with no time limit imposed on its com- 
pletion. Control subjects received the posttest 
without exposure to the instruction. 
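
The median split and the random assignment to response mode can be illustrated with a short sketch. It is applied here, for simplicity, to 76 instructed subjects only; the scores are hypothetical, and the handling of ties at the median is an assumption the report does not specify.

```python
import random


def median_split(scores):
    """Split subject ids into equal-sized high and low halves by score.

    `scores` maps subject id -> verbal ability score. Ties at the median
    are broken by sort order (an assumption, not stated in the source).
    """
    ranked = sorted(scores, key=lambda sid: scores[sid], reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


def assign_response_mode(subject_ids, rng):
    """Randomly assign half of a stratum to overt and half to covert."""
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"overt": shuffled[:half], "covert": shuffled[half:]}


rng = random.Random(1)    # seed used only to make the sketch repeatable
# Hypothetical scores on a 48-item test for 76 instructed subjects.
scores = {sid: rng.randint(0, 48) for sid in range(76)}
high, low = median_split(scores)
design = {"high": assign_response_mode(high, rng),
          "low": assign_response_mode(low, rng)}
cell_sizes = {va: {rm: len(ids) for rm, ids in cells.items()}
              for va, cells in design.items()}
print(cell_sizes)
```

Assigning response mode within each ability half, rather than across the whole sample, is what guarantees equal cell sizes in the factorial design.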


Results 


Table 1 presents the means and standard 
deviations for posttest corrects collapsed 
across EA levels. A VA × RM × EA anal- 
ysis of variance yielded significant main ef- 
fects for VA (F = 5.26, df = 1/72, p < 
.025) and RM (F = 7.30, df = 1/72, p < 
.01). Neither differences between EA levels 
nor any of the resulting interaction terms 
reached statistical significance. Apparently 
posttest performance is increased by the re- 
quirement to underline answers; however, 
this facilitation appears unrelated to the 
number of error choices available during 
learning. 


TABLE 1
MEANS AND STANDARD DEVIATIONS ON POSTTEST CORRECTS SEPARATELY FOR
ABILITY LEVELS, RESPONSE CONDITIONS, AND NONINSTRUCTED CONTROLS

                           Verbal ability
                       High             Low           Controls
Response condition    M      SD       M      SD       M      SD
Overt               20.53   1.74    18.47   2.63     9.06*  1.47
Covert              18.21   2.76    17.37   3.28

* No instruction given (n = 12).


TABLE 2
MEANS AND STANDARD DEVIATIONS (IN SECONDS) PER FRAME ACROSS
RESPONSE-MODE LEVELS FOR EACH ERROR ITEM TYPE

                        Number of errors available per frame
                          0              1              2              3
Response condition     M      SD      M      SD      M      SD      M      SD
Overt                64.80  10.82   63.30  12.15   62.11  12.71   71.80   9.46
Covert               63.31  11.92   61.80  13.01   61.85  11.52   73.80  11.38


Table 2 contains means and standard de- 
viations in seconds per frame across RM 
levels for each EA item type. The analysis 
of variance for these data showed signifi- 
cant effects for RM (F = 4.82, df = 1/72, p 
< .05), EA (F = 27.03, df = 3/216, p < 
.01), and the RM × EA interaction (F = 
5.42, df = 3/216, p < .01). No other effects 
were significant. The EA treatment means 
were further compared by the Tukey b pro- 
cedure (Winer, 1962). The difference be- 
tween EA-3 and the remaining three condi- 
tions was the only comparison to reach sta- 
tistical significance (p < .01). The signifi- 
cant RM × EA interaction was primarily a 
function of the overt-response subjects tak- 
ing a disproportionately greater time per 
EA-3 item than the covert subjects. As 
would be expected, underlining items re- 
quires more time; however, the overt mean 
frame times show a sharper rise than the 
covert group in the case where the subject is 
faced with an item containing no correct 
choice at all. 

Learning errors were computed for the 
overt learners on the EA-1 and EA-2 items 
which were the only questions on which 
these data were available. The error rate 
was about 7% for these subjects. Counting 
EA-3 items as errors and EA-0 items as 
corrects, the conditional probability of a 
correct posttest response, given a correct re- 
sponse during learning, is P(R|R) = .92, 
whereas the probability of a correct posttest 
response, given an error during learning, is 
P(R|W) = .63. The difference between 
these two probabilities is significant (Z = 
9.03, p < .01) offering strong support for 
our original general assumption of learn- 
ing-posttest response perseveration, at least 
under the conditions for the overt-response 
subjects. It was not possible to calculate 
these same values for the covert subjects 
since they made no response in the program 
booklet during learning. 
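
The two conditional probabilities and their comparison can be sketched as follows. The report does not say which test produced the Z value, so the pooled two-proportion statistic below is an assumption, and the item-level records are hypothetical rather than the study's data.

```python
from math import sqrt


def conditional_accuracy(records):
    """records: iterable of (learning_correct, posttest_correct) booleans.

    Returns (P(R|R), P(R|W)): posttest accuracy given a right or a wrong
    learning response.
    """
    right = [post for learn, post in records if learn]
    wrong = [post for learn, post in records if not learn]
    return sum(right) / len(right), sum(wrong) / len(wrong)


def two_proportion_z(k1, n1, k2, n2):
    """Pooled z statistic for the difference between two proportions."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical item-level records, not the study's data.
records = ([(True, True)] * 9 + [(True, False)] * 1 +
           [(False, True)] * 3 + [(False, False)] * 2)
p_rr, p_rw = conditional_accuracy(records)
z = two_proportion_z(9, 10, 3, 5)
print(round(p_rr, 2), round(p_rw, 2), round(z, 2))
```

Because each subject contributes many items, the item-level observations are not independent; a statistic of this kind should therefore be read as descriptive unless the dependence is modeled.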


Discussion 


These results provide little support for 
the predicted effects of varying error avail- 
ability during learning. The posttest analy- 
sis fails to demonstrate that the mere fact 
of forcing a learner to respond incorrectly 
will result in transfer of that error to the 
posttest. In fact, none of the error choice 
patterns yielded a measurable difference in 
posttest performance. The most important 
finding concerns this lack of posttest differ- 
ences between the EA conditions. When the 
number of learning alternatives is equated, 
incorrect learning choices decrease the 
probability of a correct posttest response. 
However, the lack of differential effects for 
EA levels indicates a need for reevaluation 
of the hypothesis that a direct causal rela- 
tionship exists between learning and post- 
test errors. 

The present data, in conjunction with 
previous findings, suggest that the transfer 
of learning errors to the posttest may be a 
function of nothing more than the student's 
inability to respond correctly in the first 
place. The fact that a learner is forced to 
make an initial error appears to have little 
influence on whether or not he will respond 
correctly on the posttest, at least with text 
materials. In the previous research de- 
scribed, both the Kaess and Zeaman (1960) 
items and the meaningful materials used by 
Elley (1966) consisted of factual questions 
to which the correct answer is not initially 
available to subjects not familiar with the 
subject matter. Elley's meaningful items 
consisted of logical series problems and ver- 
bal analogies, both of which can be solved 
without prior familiarity in the same fash- 
ion as similar items on intelligence tests. 
That no perseveration effect was demon- 
strated with these materials supports the 
contention that direct error transfer de- 
pends on whether the learner can answer 
the questions by simple inspection of the 
instruction. In this case, the only errors to 
perseverate were those due to the subjects’ 
lack of understanding. When the learner 
makes an incorrect choice because he fails 
to understand the question, or because it is 
difficult (or impossible in some instances) to 
answer from the text, we would expect that 
error to perseverate to the posttest. This 
explanation accounts for the higher proba- 
bility of a posttest error given a learning 
error, but seeks to place the blame on in- 
comprehension rather than the format of 
presentation. 

If our lack-of-understanding hypothesis 
is a valid explanation for perseveration, one 
could expect learners to pay greater atten- 
tion to answering questions for which no 
correct answer is available. Under this con- 
dition, subjects would work harder at stud- 
ying the material simply because they are 
unable to find a correct response within 
their normal study interval. However, once 
the correct answer is gleaned from the text 
it makes little difference which alternative 
is chosen. In the present experiment, the 
finding that all-error items took signifi- 
cantly longer than any of the remaining 
three EA possibilities supports this assump- 
tion. In the EA-0, EA-1, and EA-2 condi- 
tions a correct response was available; con- 
sequently the amount of time spent deter- 
mining the right answer was reduced be- 
cause subjects were able to locate a match 
between an alternative and the appropriate 
text material. This effect becomes more ap- 
parent under a requirement to overtly re- 
spond, possibly because subjects are some- 
what hesitant to make the actual choice be- 
tween errors. This same factor may have 
operated to yield superior posttest perform- 
ance for the overt group, since the condi- 
tions of the experiment forced these sub- 
jects into a closer perusal of the text prior 
to making a publicly observable response. 
These results are in agreement with Ander- 
son’s (1970) conclusion that any manipula- 
tion which requires the subject to pay 
greater attention to the material will lead 
to the possibility of increased learning. One 
might speculate on differences in the RM 
results had the subjects been given feedback 
following their frame responses. However, 
neither the Kaess and Zeaman (1960) nor 
Elley (1966) results bear out this assump- 
tion. The feedback given would have to be 
binary (yes or no) in nature containing as 
little information as possible. Since the 
posttest values in Table 1 show that treat- 
ment subjects learned a great deal from the 
program, compared to the control group, it 
seems unlikely that addition of minimal 
feedback would have seriously affected their 
scores. Combined with the evidence that lit- 
tle difference exists between groups receiv- 
ing feedback and no feedback in pro- 
grammed lessons if the feedback itself can 
be obtained by a subject prior to responding 
(see Anderson, Kulhavy, & Andre, 
1971a, 1971b), these facts indicate that ef- 
fects of providing the type of feedback pos- 
sible in this case would be minimal. 

On the present evidence it seems justifia- 
ble to hypothesize that making a learning 
error per se does not guarantee the transfer 
of that error to the posttest. Rather, we pre- 
fer the almost trite notion that learning- 
posttest error perseveration occurs because 
the learner cannot obtain correct informa- 
tion during learning. Only when the subject 
is unable to solve the given problem would 
we expect the learning error to transfer di- 
rectly to the posttest. Furthermore, since 
any variable which increases inspection be- 
havior should lead to better learning, the 
requirement to overtly respond will result 
in superior learning of either errors or cor- 
rects depending on whether or not questions 
can be answered from the material given. 

If the incomprehension hypothesis is 
valid, the conclusions regarding instruc- 
tional design become obvious. When mate- 
rial is constructed in such a manner that 
students can answer questions through 
study alone, the format and availability of 
incorrect choices should have no direct det- 
rimental effects on posttest performance. 
Under the conditions of properly designed 
instruction, requiring subjects to indulge in 
closer inspection through the medium of an 
overt response should result in greater 
learning gains (cf. Kemp & Holland, 1966). 
Manipulating subject variables appears to 
be a profitable venture provided instruction 
is adequately presented. It is hoped that 
this conclusion serves to focus attention on 
problems of content validity and away 
from the quest for panacean formats and 
presentation styles. 


VARIABLE ADJUNCT QUESTION SCHEDULES, 
INTERPERSONAL INTERACTION, AND 
INCIDENTAL LEARNING FROM 
WRITTEN MATERIAL 


ERNST Z. ROTHKOPF² 


Bell Telephone Laboratories, Incorporated, 
Murray Hill, New Jersey 


Previous work indicates that questioning by teachers during individual 
study produced better instructional results than written questions 
embedded in text. Were these results due to questioning procedures 
or periodic social interaction with isolated students? Is interval be- 
tween questions important? High school students (n = 179) read a 
14,200-word science text. Treatments differed in (a) use of text-rele- 
vant adjunct questions or “social queries"; (b) questions embedded in 
text or asked by monitor; and (c) regularity of interval between ques- 
tions. A retention test indicates (a) that text-relevant questions are 
a critical ingredient for the facilitative effect of contact with the 
teacher-monitor, and (b) intermittent question schedules are no better 
than regular intervals and may actually be worse. 


Questions personally asked by monitors 
during the course of study were found to 
result in more incidental learning from 
written material than the same adjunct 
questions embedded in the text in written 
form (Rothkopf & Bloom, 1970). The main 
aim of the present experiment was to deter- 
mine whether the observed social facilita- 
tion resulted from monitors asking text-rel- 
evant questions or whether it was due to 
periodic contact with a teacher or monitor 
during study. A secondary purpose was to 
explore whether variable intervals between 
questions have some role in determining the 
instructional effectiveness of adjunct ques- 
tions. Fixed intervals between questions 
have been used in most previous studies. 

¹ The author is indebted to Ann Gormeley and 
à en Robinson of Rutgers University for their 
help in the collection of data. 

² Requests for reprints should be sent to Ernst 
Z. Rothkopf, Learning and Instructional Processes 
Research Group, Bell Laboratories, 600 Mountain 
Avenue, Murray Hill, New Jersey 07974. 
METHOD 


Materials 


Approximately 14,200 words from a high school 
geology text (Earth Science, Fletcher & Wolfe, 
1959) were used.³ These materials were a shortened 
version of a text used in a previous study (Roth- 
kopf & Bloom, 1970). The material was photo- 
graphed on 90 negative 35-millimeter slides. One 
adjunct question was selected for each six text 
slides. They were the same as those used in the 
previous study. 


Apparatus 


The experiment was conducted at Hunterdon 
Central High School, Flemington, New Jersey, in 
a room especially set aside for this purpose. Each 
subject was seated in a wooden portable booth 
that was fitted for rear projection of the text ma- 
terial. The projector was operated by the subject 
with a hand switch. The circuits connecting the 
subject's switch to the projection equipment were 
wired through control panels in the experimenter's 
booth. This allowed the experimenter to monitor 
each subject’s progress through the text by means 
of automatic cumulating counters and to record 
automatically time spent by a subject in inspecting 
each slide. The control panel also included one 
preset counter for each subject, wired so as to dis- 
able a subject’s control switch when a predeter- 
mined page was reached. For further details of the 
apparatus and general procedure see Rothkopf and 
Bloom (1970). 

³ Permission for the experimental use of these 
copyrighted materials was kindly granted by the 
publishers, D. C. Heath and Co., Boston, Massa- 
chusetts. 

Copyright © 1972 by the American Psychological Association, Inc. 


Treatments and Procedure 


Subjects entered the experimental room individ- 
ually, but three subjects were run at one time. 
Each group of three subjects was assigned in an 
unbiased manner to one of six experimental con- 
ditions. The original experimental plan called for 
33 subjects per experimental condition, but due to 
apparatus failure some subjects had to be dis- 
carded. Regardless of treatment assignment, sub- 
jects read the first 24 text slides without exposure 
to any questions and without further contact with 
the experimenter. The treatments were as follows: 

Treatment WR (written questions-regular in- 
terval, n = 32). A written question⁴ was presented 
immediately after Text Slide No. 30 and after 
every sixth text slide thereafter. The question, on a 
negative slide, was projected in the same manner 
as the text slides. 

Treatment WI (written questions-irregular in- 
terval, n = 32). Adjunct questions were exactly 
the same as in WR but presented at irregular in- 
tervals that ranged from 2 to 13 text slides. The 
frequency distribution of the lags between location 
of information and the appearance of a question 
over all 11 questions was matched exactly to the 
lag distribution of the WR condition. Lag was 
defined as the number of slides between the slide 
on which information was presented and the occa- 
sion on which a question was asked. For example, 
if certain information was presented on Slide 10 
and a question was asked about this information 
immediately after Slide 15, the lag was five. The 
frequency of lags of various magnitudes was deter- 
mined for the WR treatment. Eleven intermittent 
questioning schedules were then composed for the 
WI treatment. Each of these schedules used the 
same questions as the WR treatment but with 
questions occurring in different randomly deter- 
mined locations, with the restriction that no ques- 
tion precede the presentation of relevant infor- 
mation in the text sequence and that the frequency 
of various lags be exactly the same as WR. Each 
group of three subjects was under a different ques- 
tioning schedule. 
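
The lag definition and the restrictions on the irregular schedules can be made concrete with a small sketch. The slide numbers on which each question's information appears are hypothetical, as are the function names; only the regular positions (after Slide 30 and every sixth slide thereafter) and the two stated restrictions come from the description above.

```python
from collections import Counter

FIRST_TREATED_SLIDE = 25
LAST_SLIDE = 90
INTERVAL = 6


def regular_positions():
    """Slides after which a WR question appears: 30, 36, ..., 90."""
    start = FIRST_TREATED_SLIDE + INTERVAL - 1
    return list(range(start, LAST_SLIDE + 1, INTERVAL))


def lags(info_slides, question_positions):
    """Lag = slides between the information slide and the question."""
    return [q - s for s, q in zip(info_slides, question_positions)]


def is_valid_irregular(info_slides, wr_positions, wi_positions):
    """Check a candidate WI schedule against the stated restrictions."""
    wi_lags = lags(info_slides, wi_positions)
    no_question_before_info = all(lag >= 0 for lag in wi_lags)
    same_lag_distribution = (Counter(wi_lags) ==
                             Counter(lags(info_slides, wr_positions)))
    return no_question_before_info and same_lag_distribution


# Hypothetical slides carrying the information for the 11 questions.
info = [27, 33, 38, 44, 50, 55, 62, 66, 74, 80, 86]
wr = regular_positions()
wr_lags = lags(info, wr)
# One candidate WI schedule: the same lags, reassigned across questions.
rotated = wr_lags[1:] + wr_lags[:1]
wi = [s + lag for s, lag in zip(info, rotated)]
print(wr, wi, is_valid_irregular(info, wr, wi))
```

With the text's own example, information on Slide 10 and a question immediately after Slide 15 gives `lags([10], [15]) == [5]`. Matching the lag distribution in this way separates the effect of interval regularity from the effect of how long information must be retained before it is questioned.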

Treatment OR (oral questions-regular interval, 
n = 24). This treatment was exactly the same as 
WR except that the monitor (the experimenter) 
asked the question. This was done in the following 
way. The preset counters in the experimenter’s con- 
trol panel were set to disable the subject's slide 
control switch after every sixth slide in the se- 
quence between Text Slides 25 and 90. Prompted 
by a flashing red indicator light, the experimenter 
would walk to the subject’s booth and ask the 
scheduled question. The questions were spoken in 
a low, well-intoned, but relatively neutral voice. 
During questioning, the experimenter stood be- 
hind the subject’s right shoulder. Normally there 
was no eye contact between the experimenter and 
the subject. However, the experimenter did not 
systematically attempt to avoid such contact. If a 
subject paused after the question was read, it was 
repeated once. After the subject responded, the 
experimenter left the booth without providing any 
feedback. 

⁴ The adjunct questions and other test ma- 
terial used in this study may be obtained from 
the author. 

Treatment OI (oral questions-irregular interval, 
n = 28). The same questioning schedules were used 
as for WI, but the questions were asked by the ex- 
perimenter using the technique described under OR 
above. 

Treatment SCR (social contact only-regular 
interval, n = 31). This treatment was similar to 
OR. However, in place of oral questioning about 
the experimental passage, the experimenter asked 
neutral questions that were unrelated to the text, 
for example, “Is the focus of the slide projector 
satisfactory?” 

Treatment NOEQ (no experimental questions, 
n = 80). In this treatment the experimental pas- 
sage was inspected without interruption by either 
experimental question or experimental social con- 
tact. 

After the first 66 text slides had been inspected, 
each subject regardless of treatment left the room 
for a break of approximately 4 minutes’ duration. 
During this time, the experimenter changed the 
slide tray of the projector. No contacts occurred 
between subjects during the break. 


Testing Procedure 


In order to provide a time interval between 
acquisition and testing, the experimenter brought 
an 18-item questionnaire concerning the subject's 
reading interest to the subject’s booth immediately 
following the last text slide. The questionnaire was 
removed exactly 10 minutes later even though no 
subject was able to complete it during that interval. 
Immediately after the questionnaire, the criterion 
recall test (CT) was administered, consisting of 8 
questions from Text Slides 1 to 24 and 18 questions 
from Text Slides 25 to 90. The questions were 
chosen to have no direct overlap in content with 
the experimental questions used during reading. 
Each question required a one- or two-word answer. 
No time limit was set for the test. 

Following completion of the criterion recall 
test, subjects were given a second test (FQRT) 
consisting of the 11 experimental questions used 
during reading. However, due to late starting times 
and slow reading, some subjects did not have time 
to complete this test. 




Subjects 

Paid volunteer students (n = 179) from Hunter- 
don Central High School, Flemington, New Jersey, 
participated in the experiment shortly after their 
last school period.⁵ 


RESULTS 


Criterion Test 


The criterion test was divided into two 
parts: (a) the eight questions derived from 
Text Slides 1 to 24, that is, the untreated 
portion of the text; and (b) the 18 questions
about Text Slides 25 to 90, that is, the
portions of the text in which the treatments 
took place. The two parts were scored sepa- 
rately. Using performance on Part 1 as con- 
trol variable, an analysis of covariance of 
correct responses on Part 2 indicated that 
the experimental treatments produced significant
effects (F = 2.28, df = 5/176, p < .05).
The adjusted means for the several
treatments are shown in Table 1. 

The hypothesis that the facilitating effect 
on study activities produced by periodic 
questioning by monitors was simply due to 
periodic contact with the monitor was 
tested by comparing the OR treatment (X̄
= 11.33, 62.94%) with SCR (X̄ = 9.34,
51.88%). The OR treatment resulted in a
substantially greater number of correct re- 
sponses (t = 3.05, df = 176, p < .01). The 
SCR-treatment subjects performed at about 
the same level as the NOEQ group. It can 
therefore be concluded that text-relevant 
questions were an important ingredient of 
the facilitating effects that have been ob- 
served in connection with the OR condition. 
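
The percentages quoted alongside the means appear to express each adjusted mean as a share of the 18 questions in Part 2 of the criterion test. The short Python sketch below reproduces that arithmetic for the two treatments just compared. The function name, and the choice to truncate rather than round to two decimals, are assumptions made here so that the output matches the printed figures; the article does not state how the percentages were computed.

```python
import math

PART2_ITEMS = 18  # questions about Text Slides 25 to 90


def percent_correct(mean_score, n_items=PART2_ITEMS):
    """Express a mean score as a percentage of n_items.

    Assumption: the value is truncated (not rounded) to two decimals.
    """
    pct = mean_score / n_items * 100
    return math.floor(pct * 100) / 100


# Adjusted criterion test means taken from Table 1.
adjusted_means = {"OR": 11.33, "SCR": 9.34}
percentages = {name: percent_correct(m) for name, m in adjusted_means.items()}
```

Percentages reported for pooled treatments (for example WR + OR) may not be reproducible from the rounded table entries in this way, since they were presumably computed from unrounded means.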

The hypothesis that a variable interval 
between adjunct questions produced better 
incidental learning than regular intervals
was rejected because the average of the WR 
and OR conditions (10.70, 59.44%) was ac- 
tually somewhat higher than the mean of 
the combined WI and OI treatments (t =
1.30, df = 176, p > .10).

The results from the criterion test also
confirmed two results of previous experiments.
The first of these (see Rothkopf &
Bloom, 1970) was that adjunct questions
presented by a monitor (OR) under regular
questioning schedules produced better
performance than written questions (WR)
embedded in the text (t = 1.89, df = 176, p
< .05, one-tailed). The second was that
adjunct questions (WR + OR + WI + OI, X̄
= 10.41, 57.85%) can result in better
instructional results than reading without
text-relevant questions (NOEQ + SCR, X̄
= 9.44, 52.44%; t = 2.46, df = 176, p <
.02).


TABLE 1
MEAN ADJUSTED CRITERION TEST SCORES, EQRT
SCORES, AND INSPECTION RATES (SYLLABLES/
MINUTE) ON TEXT SLIDES 25-90 FOR THE
VARIOUS TREATMENTS

Treatment   Criterion test   EQRT   Inspection rate
WR              10.13        7.84       117.65
WI              10.07        6.56       121.21
OR              11.33        8.38       112.66
OI              10.18        7.66       101.16
SCR              9.34        5.51       111.65
NOEQ             9.54        6.00       113.47

Note.—Abbreviations: WR = written questions-regular
interval; WI = written questions-irregular
interval; OR = oral questions-regular
interval; OI = oral questions-irregular schedule;
SCR = social contact only-regular interval;
NOEQ = no experimental questions; EQRT =
experimental questions used during reading.


⁴ The author is grateful to Richard D. Bloom
and the principal and staff of Hunterdon Central
High School, Flemington, New Jersey, for their
cooperation in obtaining volunteer subjects.


EQRT Test 


The findings from the EQRT, in general, 
were in keeping with the results from the 
criterion test. The data are shown in Table 
1. Neither the SCR nor the NOEQ had been 
exposed to adjunct questions. Therefore, 
analysis of the EQRT results for these two 
treatments was not appropriate. Comparison
of the means of the oral questioning
groups (OR + OI, X̄ = 8.00, 72.72%) with
the written question condition (WR + WI,
X̄ = 7.04, 64.01%) indicated better
performance after oral questioning (t = 2.12,
df = 102, p < .05).

The EQRT data also indicated stronger 
performance following regularly spaced 
questioning than following irregular
interquestion intervals. The difference between
WR + OR (X̄ = 7.97, 72.43%) and WI +
OI (X̄ = 7.07, 64.27%) was significantly
greater than chance (t = 1.99, df = 102, p
< .05). It can be concluded that the direct
instructional effects of questions were 
stronger with regular than with irregular 
interquestion intervals. 

The higher performance levels on the 
EQRT relative to the criterion test for all 
but the SCR and NOEQ treatments re- 
flected previous experience (i.e., rehearsal) 
with the adjunct questions during reading. 


Other Observations 


Inspection rates. Experiments on the effects
of adjunct questions on reading rate
have produced weak and somewhat confusing
results (Rothkopf & Bloom, 1970).
Local effects have been observed in that 
inspection rates for materials that follow 
immediately after questions tend to be 
somewhat slower than inspection rates for 
the remaining text. But the materials that 
immediately follow the questions are a very 
small portion of the total text, and differ- 
ences between overall treatment averages 
either in inspection time or reading rate 
have been found to be small and inconsist- 
ent (Rothkopf & Bisbicos, 1967). Negative 
correlations between average reading rate 
per treatment and the average number of 
correct responses on the criterion test have 
usually been observed, but these correla- 
tions have been small. 

Inspection rates in syllables per minute 
for Text Slides 25-90 are shown in Table 1. 
They have been adjusted for individual dif- 
ferences in inspection rate by the covari- 
ance technique, using inspection rate on 
Text Slides 1-24 as a control variable. The 
findings of the present experiment confirm 
previous results. Differences among treatments
were significant (F = 3.978, df =
1/112, p < .05), but only the comparison
between the WI and OI treatments was reliable,
using the Newman-Keuls method (p
< .05). The rank correlation between mean
inspection rate per treatment and mean test
performance was low (ρ = .371).

Adjunct questions and individual differences.
Rothkopf (1969, p. 126) has proposed
that adjunct questions would be
most likely to have useful and measurable
instructional consequences if inspection 
(mathemagenic) activities were ineffective 
or deteriorating. Ineffective inspection ac- 
tivities may be indicated by low perform- 
ance on test items derived from Slides 1-24, 
that is, the untreated portions of the text, 
although such low performance may also be 
due to low ability. An indication of deterio- 
rating inspection activities may be rapid in- 
creases in inspection rate. The best indica- 
tor of ineffective inspection activities is 
probably a combination of low test per- 
formance on items from Slides 1 to 24 and 
rapid acceleration of inspection speed. An 
attempt was made to test this conjecture in 
the following way. Subjects were selected 
from the two major treatments (Question, 
WR + OR, vs. No Questions, SCR + 
NOEQ) according to learned performance 
and change in inspection rate on the un- 
treated portions of the text (Slides 1-24). 
Subgroups selected on the basis of these two 
variables were then compared with respect 
to the effects of adjunct questions on per- 
formance on the criterion test. With respect 
to learned performance on the untreated
portions of the text, that is, proportion of
correct responses (P1) on test items from
Slides 1 to 24, two groups were selected
from each of the two treatments: (a) P1 >
.62 but not > .87; and (b) P1 ≤ .37 but not
< .12. Subjects with P1 = .50 were not used
in order to provide better separation between
the two groups, and subjects with P1
> .87 or < .12 were not used to avoid artifacts
associated with extreme scores.
Change in inspection rate was determined 
by subtracting average inspection rate on 
Slides 13-24 in syllables/minute from aver- 
age inspection rate on Slides 1-12. This dif- 
ference was the basis for casting subjects 
into two groups: (a) those whose inspection-rate
change was −1 syllable/minute or
less; and (b) those with changes greater
than −1 syllable/minute.
The requirements of this 2 x 2 selection 
scheme were quite restrictive, and conse- 
quently only nine subjects could be placed 
in each cell of the 2 x 2 classification in 
each of the two treatments. Average pro- 
portion correct responses on test items from 
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ADJUNCT QUESTIONS 
— — — NO QUESTIONS 
O @ ACCELERATORS 
NON-ACCELERATORS 


0.6 


0.4 0.8 


PROPORTION CORRECT RESPONSES, 
UNTREATED TEXT 
(SLIDES 1-24) 


Fra. 1. Proportion correct responses on test items from Slides 25 to 90 as function of pro- 


portion corrected responses on items from the untreated portion of the text. 


(Accelerators 


are subjects whose inspection speed increased by at least one syllable/minute in the first 24 


text slides; see text for further explanation.) 


Slides 25 to 90 was calculated for each of 
the cells of the two fourfold tables. The re- 
sults are shown in Figure 1. An inspection 
i this figure indicates that subjects with 
‘ow initial learning and rapid increases in 
Inspection rate tend to benefit most from 
Hase questions and that the conjecture 
: Slee palette of inspection activities 
rd t erefore be correct. Average perform- 
n of subjects with low initial learning 
id increases in inspection rate was 
iud vien lower for subjects with no ad- 
P di questions than for subjects receiving 
d a questions. There were hardly any 
WH hide ine to treatment for subjects 
AR igh initial learning performance re- 
ae of inspection speed changes. 

Rant s e X 2 analysis of variance (Treat- 
Rate) P: Level x Change in Inspection 
the dà using an are sine transformation of 
ee ontdent variable, indicated a signifi- 

eatment effect (F = 42.547, df = 


1/64, p < .001). However, the predicted in- 
teractions effects proved unreliable, al- 
though for Treatment X Pi Level, .05 < p 
< 10 (F = 2.949, df = 1/64). The data 
were not sufficient to reject the null hypoth- 
esis, but they were not inconsistent with the 
deterioration conjecture. 
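
The subgroup selection and the transformation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's procedure: the function names are hypothetical; the cutoffs .12, .37, .62, and .87 are read here as two-decimal truncations of 1/8, 3/8, 5/8, and 7/8 (there were eight items on Slides 1-24), so the groups are expressed as counts of correct answers; and the arc sine transformation is taken to be the common 2·arcsin(√p) form, which the article does not specify.

```python
import math

N_ITEMS_UNTREATED = 8  # criterion questions drawn from Slides 1-24


def p1_level(n_correct):
    """Classify a subject by number correct out of 8 on the untreated text.

    Assumed reading of the printed cutoffs: 5-7 correct = high group,
    1-3 correct = low group; 0, 4, and 8 correct are excluded.
    """
    if n_correct in (5, 6, 7):
        return "high"
    if n_correct in (1, 2, 3):
        return "low"
    return None  # excluded: P1 = .50 or an extreme score


def is_accelerator(rate_slides_1_12, rate_slides_13_24):
    """True if the inspection-rate change is -1 syllable/minute or less.

    Change = rate on Slides 1-12 minus rate on Slides 13-24, so a
    negative change means the subject sped up.
    """
    change = rate_slides_1_12 - rate_slides_13_24
    return change <= -1.0


def arcsine_transform(p):
    """Variance-stabilizing transform for a proportion (assumed form)."""
    return 2.0 * math.asin(math.sqrt(p))
```

A subject with 6 of 8 correct whose rate rose from 100 to 105 syllables/minute would, under these assumptions, fall in the high-P1 accelerator cell.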


DISCUSSION


The results of this experiment replicate 
the earlier finding that questioning by a
teacher during individual study produced 
better instructional results than written 
questions embedded in the text. The data 
indicated that the text relevance of ques- 
tions rather than social interaction was the 
facilitative ingredient of contact with the 
teacher-monitor. No special effects due to 
variable interval between questions were 
found except that it appeared to reduce the 
direct instructive effect of adjunct questions 
somewhat. There were some indications
that adjunct questions, when they are facilitative,
are most helpful to subjects whose
mathemagenic activities are ineffective.
This effect would be important practically
and requires further explanation.

Zajonc (1965) concluded from his review
of the social facilitation literature that students
should study in isolation booths and
take their examinations with groups of
other students. The now twice-observed social
facilitation effect for adjunct questions
suggests an amendment to Zajonc’s
recommendation.


REFERENCES

Fletcher, G. L., & Wolfe, C. W. Earth science,
Rothkopf, E. Z. Some conjectures about inspection
behavior in learning from written sentences and
the response mode problem in programed
self-instruction. In R. C. Anderson, G. W. Faust, M.
C. Roderick, D. J. Cunningham, & T. Andre
(Eds.), Current research in instruction. Englewood
Cliffs, N.J.: Prentice-Hall, 1969.
Rothkopf, E. Z., & Bisbicos, E. E. Selective
facilitative effects of interspersed questions on
learning from written materials. Journal of
Educational Psychology, 1967, 58, 56-61.
Rothkopf, E. Z., & Bloom, R. D. Effects of
interpersonal interaction on the instructional value
of adjunct questions in learning from written
material. Journal of Educational Psychology,
1970, 61, 417-422.
Zajonc, R. B. Social facilitation. Science, 1965, 149,
269-274.


One hundred and forty-four male and female subjects (ages 6-9)
performed on concept identification tasks designed to study the effects 
of availability of two types (past correct and past incorrect in- 
stances), amounts of memory information, and subjects’ sex in con- 
cept learning. Major results were: (a) females showed superior per- 
formance as compared to males, (b) availability of past correct 
instances facilitated concept identification performance, and (c) older 
females profited from memory information to a greater degree than 
their male peers, especially when past correct instances were provided.


Results of recent studies have shown that 
memory load can be an important variable 
influencing concept identification perform- 
ance. In such studies memory is manipu- 
lated by allowing a subject to view, on a 
given trial, one or more previous stimuli. 
This amounts to a reduction in memory 
load. Available stimuli can be limited to 
previous correctly classified or incorrectly 
classified patterns. There is considerable
evidence that availability of past instances
improves concept identification performance
among adults (Bourne, Goldstein, &
Link, 1964; Hunt, 1961; Pishkin & Wolfgang,
1965). This is true also for memory of
information across several concept identification
problems. Dominowski (1965) reported
good retention of previously identified
concepts and of specific stimulus characteristics
in adult subjects, when past instances
were available to the subject.

In addition to research on concept identification
and memory with adults, data exist
to support the contention that marked
changes take place in maturing children in 
memory and cognitive processes (Kendler 
& Kendler, 1962; Olson, Miller, Hale, & 
Stevenson, 1968). Accordingly, a number of 
recent investigations have been aimed at an 
examination of the role of memory during 
problem solving in young children (e.g., In- 
glis, Ankus, & Sykes, 1968; Pishkin, Wolf- 
gang, & Rasmussen, 1967). Pishkin, Wolf- 
gang, and Rasmussen (1967) studied concept 
identification performance as a function of 
age (10 to 18 years) and number of past 
instances available per trial. It was found 
that performance of the younger subjects 
(age 10 to 12) improved more due to in- 
stance availability than did that of the ado- 
lescent subjects (age 13 to 18). This is true 
especially for high-complexity tasks. This 
finding is consistent with that of Bourne et 
al. (1964) who found that solution of more 
complex tasks was facilitated when pre- 
vious stimuli are made available to the sub- 
ject. Another related line of investigation in 
concept formation was conducted by Newell 
and Simon (1967) who investigated differ- 
ential short- and long-term memory effects. 
They concluded that learning rates in con- 
cept identification are dependent upon 
memory. It was postulated that long-term 
memory functions differently from short- 
term memory. The short-term store
transfers to the long-term store the information
which the subject has gained from the concept.
Long-term memory, in turn, amasses
and organizes this information to be tapped 
in succeeding tasks. 

Few studies have been concerned with the 
role of sex in cognitive performance. How- 
ever, any study of developmental changes 
in cognitive ability would seem to need to 
evaluate the effect of any differential rates 
of maturation for the two sexes. Pishkin, 
Wolfgang, and Rasmussen (1967) found 
that when memory load was reduced in 
younger subjects of both sexes their per- 
formance closely approximated that of 
older children who were not given the re- 
duced memory load task. Females generally 
benefitted more from memory information 
than did males; and females were more 
efficient than males in utilizing correct in- 
stance information beyond the availability 
of only one past instance. Availability of 
only incorrect instances failed to improve 
performance. Closely related to this finding 
are results reported by Tyler (1956, p. 254). 
In a study of memory tasks which required 
recall of digits and reproduction of geometric
patterns from memory, she found that
females were superior to males. Contrary to 
these findings, Osler and Kofsky (1965) 
found no significant differences in the con- 
cept performance of 4, 6, and 8 year olds as 
a function of sex. 

In general, it may be concluded that the 
ability to utilize memory cues increases dif- 
ferentially with age for males and females. 
Past correct instances provide more effec- 
tive memory cues than past error instances, 
that is, more of the latter must be made 
available than of the former to equalize 
performance (Pishkin, Wolfgang, & Rasmussen,
1967). However, it is not clear from
past results if sex differences in ability to
use memory cues exist from birth or if
males and females diverge at some particular
age. To clarify this point, this study was
designed to investigate the effects of memory
cues (availability of correct and/or incorrect
past instances) upon concept identification
learning rates of first- and third-grade
males and females.


VLADIMIR PISHKIN 


METHOD 


Subjects 


The sample consisted of 144 students from the 
first and third grades of an elementary school in 
Oklahoma City, Oklahoma. The students were di- 
vided in the following manner: 72 males and 72 
females; 72 first graders (6 and 7 years old); and 
72 third graders (8 and 9 years old) with 36 stu- 
dents of each sex randomly selected from the two 
grade groups. 


Design 

The experiment had a 2 X 2 X 2 X 3 X 2 fac- 
torial design with three students randomly as- 
signed per cell, for a total of 48 conditions. The 
following five main effects and their respective in- 
teractions were examined: (a) sex (male or fe- 
male), (b) grade level (first or third grade), (c) 
number of past cues available (one or two cues), 
(d) types of cues (past right, wrong, or combina- 
tion of both right and wrong), and (e) problem 
(color or form relevant). For all problems one of 
two dimensions (form or color) was relevant; 
these were alternated with every three subjects. 
The type of problem, color or form, was not ex- 
pected to be a significant source of variance and 
was manipulated mainly to preclude spread of in- 
formation about the solution among subjects as 
demonstrated by previous studies (Pishkin, Shur- 
ley, & Wolfgang, 1967; Pishkin, Wolfgang, & Ras- 
mussen, 1967). It was stressed that the subjects 
were not to talk about the task with other students. 
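
As a check on the arithmetic of the design, the cells of the factorial can be enumerated directly. The Python sketch below is only an illustration; the dictionary keys and level labels are names chosen here, not the author's. It confirms that five factors with 2, 2, 2, 3, and 2 levels give 48 cells, and that three students per cell account for the 144 subjects.

```python
from itertools import product

# Factor levels as described in the Design section (labels are illustrative).
FACTORS = {
    "sex": ("male", "female"),
    "grade": ("first", "third"),
    "n_cues": (1, 2),
    "cue_type": ("right", "wrong", "right-wrong"),
    "problem": ("color", "form"),
}
SUBJECTS_PER_CELL = 3

# One dict per cell of the 2 x 2 x 2 x 3 x 2 design.
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]
total_subjects = len(cells) * SUBJECTS_PER_CELL

# Subjects contributing to each level of the cue-type factor.
per_cue_type = {
    cue: SUBJECTS_PER_CELL * sum(1 for c in cells if c["cue_type"] == cue)
    for cue in FACTORS["cue_type"]
}
```

Each cue type is therefore represented by 48 subjects, and each Sex × Grade combination by 36.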


Procedure 


The apparatus consisted of a deck of 64 3 X 5 
inch cards based on the Wisconsin Card-Sorting 
test (Pishkin, Wolfgang, & Rasmussen, 1967) and
a 22 X 28 inch poster marked off into four squares 
(categories). The stimuli consisted of two dimen- 
sions (color and form) with two levels per dimen- 
sion (red and blue for color and circle and square 
for form) creating four unique stimuli. Each stimulus
was repeated 16 times in the deck for a total of 
64 stimuli. The 2 X 2 poster grid was used to ex- 
pose correctly and/or incorrectly classified patterns 
from preceding trials. The rows were labeled yes 
or no to correspond to the subject’s classification 
of a pattern as an exemplar or nonexemplar of the 
concept. The columns were labeled R (right) or W 
(wrong) to correspond to correctly or incorrectly 
sorted stimuli. Thus, to expose a correctly sorted 
exemplar of the concept from a past trial, the experimenter
placed the card face up in the yes-R
box on the grid.

The combination of number of past cues avail- 
able (one or two) and type of information avail- 
able (right or wrong) was denoted by 1 cue-R for 
one available correctly sorted card, 2 cue-W for 
two past incorrectly sorted cards, etc.

The procedure was as follows: The stimuli were
arranged in a random order with the restriction
that no card should immediately follow itself. The
relevant dimension was color (red or blue) for
one-half of the subjects and form (circle or square)
for the other half. Stimuli were presented singly
by the experimenter. The subject’s task was to
indicate which category, yes or no, he thought each
stimulus should be placed in. The classification
rule was based on the relevant dimension. For
example, when color was relevant, blue would be
sorted as yes and red as no. Thus in the 1 cue-R
condition the subjects were instructed that the 
experimenter would tell them when they were 
right or wrong after classifying a card, and so one 
past correct instance would be left face up in the 
yes-R box, and one in the no-R box. Similar in- 
structions were given in the wrong and right- 
wrong conditions. The task was terminated after 
16 consecutive correct trials. 
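
The deck construction and the stopping rule described above can be sketched in Python. This is a hedged illustration, not the experimenter's actual method: the function names are hypothetical, the ordering routine (draw one card at a time, never repeating the previous stimulus, and start over on a dead end) is one simple way to satisfy the stated restriction and does not sample all valid orders with equal probability, and the error count assumes that trials after the criterion run are not scored.

```python
import random

# Four unique stimuli: two colors crossed with two forms.
STIMULI = [(color, form) for color in ("red", "blue")
           for form in ("circle", "square")]
COPIES = 16        # each stimulus appears 16 times, 64 cards in all
CRITERION = 16     # consecutive correct trials that end the task
MAX_TRIALS = 64


def build_deck(rng):
    """Return a 64-card order in which no stimulus immediately repeats."""
    deck_size = COPIES * len(STIMULI)
    while True:
        remaining = {s: COPIES for s in STIMULI}
        deck = []
        while len(deck) < deck_size:
            prev = deck[-1] if deck else None
            options = [s for s in STIMULI if remaining[s] > 0 and s != prev]
            if not options:
                break  # dead end: only the previous stimulus is left
            weights = [remaining[s] for s in options]
            choice = rng.choices(options, weights=weights, k=1)[0]
            remaining[choice] -= 1
            deck.append(choice)
        if len(deck) == deck_size:
            return deck


def count_errors(correct_flags):
    """Count errors up to the criterion run or the 64-trial maximum."""
    errors = 0
    streak = 0
    for is_correct in correct_flags[:MAX_TRIALS]:
        if is_correct:
            streak += 1
            if streak == CRITERION:
                break  # task terminated
        else:
            errors += 1
            streak = 0
    return errors
```

For example, a subject who misses the first two cards and then sorts the next 16 correctly would be credited with two errors.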


RESULTS AND DISCUSSION 


The dependent variable was the number 
of errors over a maximum of 64 trials. An 
analysis of variance was performed on the 
error data with the following main effects: 
sex, number of cues available, types of cues
available, grades, and problems.
The main effects of sex (F = 4.61, df
= 1/96, p < .05), number of cues (F =
7.82, df = 1/96, p < .01), and type of cues
(F = 12.35, df = 2/96, p < .001) were
significant. Consistent with previous results
(Pishkin, 1960), neither the problems (form or
color) nor any of their interactions were a
significant source of variance (Fs < 1).
Males produced significantly more errors
than did females (mean errors = 9.87 for
males and 6.77 for females). This trend was
due to the Sex × Grades and Sex × Types
of Cues interactions discussed below. The
number of cues effect was also statistically reliable.
Subjects given problems with two available
cues produced fewer errors than those
given only one cue (mean errors = 10.48
for one cue; 6.15 for two cues).

One prominent finding is that the effect
of the type of instance available is highly
reliable. Figure 1 clearly demonstrates that
fewest errors were made by the subjects
presented with past correctly sorted instances.
Availability of wrong or right and wrong
instances did not facilitate performance.
This finding is of interest since the results
of the previous study (Pishkin, Wolfgang,
& Rasmussen, 1967) revealed that for 10 to
18 year olds, performance is facilitated by
availability of right and right-wrong instances.
Apparently the young children
tested in this study were unable to utilize
information from either wrong or right-wrong
combinations of past instances.

Fig. 1. Mean errors as a function of right,
wrong, or right-wrong cues with 48 subjects in each
condition.

Earlier studies (Bourne et al., 1964;
Pishkin & Wolfgang, 1965) have shown 
that performance is facilitated by availa- 
bility of information from past trials. Sub- 
jects in the age group tested here were una- 
ble to profit from information about past 
wrong category guesses. It should be noted 
that past right instances provide direct in- 
formation about what the concept is. Past 
wrong instances, to be utilized, must be 
treated as information about what the con- 
cept is but which were previously incor- 
rectly sorted. They were placed in the boxes 
marked W and under their incorrect yes-no 
category. Thus, a subject would have to 
transform the information to utilize it prop- 
erly. 
The significant Sex x Grade interaction 
(Figure 2) shows that male and female first 
graders performed at about the same level, 
but that the female third graders were 
clearly superior to males in their ability to
solve the problem (F = 14.83, df = 1/96,
p < .001). A plausible explanation for this
result may be that teachers of both grades 
were females, and it is possible that the 
overall instruction for these students was 
couched in a somewhat feminine frame- 
work, for example, in terms of the examples 
utilized and general styles of conceptuali- 
zation. Furthermore, the experimenter was 
a female. Equally important is the finding 
that the third-grade males actually per- 
formed worse than the first-grade males. 
This might also have resulted from the use 
of language style that is more salient for 
female students. One can speculate that in 
this situation negative attitudes and poor 
attention span may have developed in the 
male students by the time they reached the 
third grade. While reduced learning ability 
might account for the poor performance of 
the third-grade males, the best guess is that 
motivational and attitudinal factors are of 
greater importance. This view is supported 
by the experimenter’s observations that the 
third-grade males seemed quite distractible
and approached the task with an attitude
that the experiment was a type of “sissy
game.” Another interpretation of this interaction
could be made in terms of the differences
in maturation prominent in 8- to
9-year-old males and females. Other experiments
are necessary to investigate the role
of teachers’ and experimenters’ sex in concept
learning of children at different age
levels in order to explain the significant Sex
× Grade interaction. In addition, it is clear
that females utilized a combination of right
and wrong information to a greater degree
than males (see Sex × Types of Cue interaction
in Figure 3), which is consistent with
previously reported results by Pishkin,
Wolfgang, and Rasmussen (1967).

Fig. 2. Mean errors as a function of sex and
grade; each point represents 36 subjects.

Fig. 3. Mean errors as a function of sex and
type of cue; each point represents 24 subjects.

The significant interaction of Sex ×
Types of Cues (Figure 3) clearly documents
the differential facilitating effects of
right, wrong, and combination of right and
wrong cues upon performances of male and
female subjects (F = 6.03, df = 2/96, p
< .01). Furthermore, it is apparent that
there was no difference in the performances
of the two sex groups when only wrong cues
were provided; however, females were able
to gain to a greater degree from right and
right-wrong cues as compared to males, who
performed progressively worse when the
wrong cues and combination of right-wrong
stimuli were introduced. This finding is consistent
with the Pishkin, Wolfgang, and
Rasmussen (1967) results. Moreover, this
Sex × Types of Cue interaction also contributes
to the main effect of the sex variable,
where it is apparent that there is no difference
between sex groups when only wrong
cues are provided.

In accordance with the earlier Pishkin,
Wolfgang, and Rasmussen (1967) report,
the Number × Types of Cues interaction
was also reliable (F = 3.86, df = 2/96, p
< .05), indicating that all subjects approached
different levels of concept identification
performance when varying types of
available information with one or two cues
were exposed. Note that in the right condition,
all available memory information is
directly usable by the subject. For the
right-wrong condition only the right half of
the information is directly usable, and for
the wrong condition all information must be
transformed for use. Thus the right-wrong
condition is intermediate in its degree of
difficulty. The fact that female subjects
were able to derive some information from
this condition, but not as much as from the
right condition, suggests that they are, at
least, capable of separating the directly usable
(right) information out from the less
usable (wrong) information. This amounts
to a slightly more sophisticated information
use strategy than the males employed. Second,
this interaction indicates that availability
of one wrong and/or right and wrong
cue had a less facilitating influence on performance
than the availability of two cues
(see Figure 4). Thus, with wrong and
right-wrong available instances, more than
one past instance must be available to facilitate
performance. Equally important is
the finding that no other interactions
were significant (Fs < 1), suggesting
that grade levels and type of cues are independent
of one another in concept identification
performance. In addition, it is clear
that grade level, number of cues, and sex, as
well as the number of cues, were not orthogonal
in their effects upon problem
solving in this particular age group of subjects.
While the earlier work of Osler and Kofsky
(1965) reveals improved performance
with increase in age, the present results
show no difference in concept identification
between first and third graders; this finding
is most likely due to the fact that the concept
identification problem was a relatively
simple one with only one bit of irrelevant
information (form or color) and was not
designed to isolate any potential information
complexity interaction with age.

Fig. 4. Mean errors as a function of number of
cues and type of cues; each point represents 24
subjects.

It may be concluded that: (a) third- 
grade female subjects were superior to 
males in their overall concept identification 
performance, although with sex groups 
pooled the first and the third graders per- 
formed at the same level; (b) availability 
of two past instances significantly improved 
performance as compared to conditions 
where only one past instance was provided, 
particularly when the past available instances
are wrong or a combination of right
and wrong; likewise availability of past,
correctly sorted stimuli significantly facilitated
performance whereas availability of
errors and combination of correct and incorrect
instances resulted in inferior performance;
(c) female subjects were able to
profit from their past correct and incorrect
responses to a greater degree than males;
furthermore, availability of two instances
in both the right and the wrong categories
produced significantly worse performance of
both sex groups as compared to conditions
where only one past correct instance was
provided.


EFFECT OF INSTRUCTIONS AND FORM OF INFORMATIVE
FEEDBACK ON RETENTION OF MEANINGFUL MATERIAL¹


PERSIS T. STURGES²


Chico State College 


Retention was measured as a function of (a) three forms of feedback
(a cue which could be used to find the correct alternative, instructions
to study the correct alternative, and instructions to study correct and
incorrect alternatives); (b) three immediate tests (nothing, recall,
recognition); (c) presence or absence of initial presentation of items; and
(d) two 7-day retention tests (recall, recognition). With a cue, retention
was optimal with no immediate test. Following immediate tests,
instructions to study all alternatives improved recall but not recognition.
It was concluded that the information retained varies with subjects’
reactions to feedback and the kind of immediate practice.


Sturges (1969, 1970) reports evidence 
that superior retention with 24-hr. delay of 
feedback varies with the form of informa- 
tive feedback and the presence and form of 
an immediate test. Her results indicate that 
the effect of delayed feedback depends upon 
(a) stimulus aspects in addition to the correct
alternative present during feedback,
and (b) the relevance of these stimuli to the
retention test. Her findings support the
interpretation that subjects with delayed
feedback respond to more stimulus aspects
of informative feedback, thus learning more
about the item; and that when this information
can be used in retention, delayed feedback
facilitates retention.³

³ [...] of these findings. They found that
[...] given, but not when informative feedback
[...] the correct alternative only.

According to these findings, it should be
possible to improve retention by manipulating
instructions and the form of informative
feedback such that subjects respond to more
stimulus aspects of informative feedback.
The present experiment investigated this 
question: Can retention similar to that with 
delayed feedback be obtained by manipu- 
lating subjects’ responses to feedback? 
Thus, no delay intervals were compared. 
The subjects’ reactions to informative feed- 
back were manipulated by the form of in- 
formative feedback and instructions, which 
were designed to compare the effect of (a) 
learning the correct alternative only, (b) 
learning the correct and the incorrect alter- 
natives, and (c) organizing the material. In 
two of the feedback conditions, informative 
feedback consisted of the entire item with 
the correct alternative indicated and sub- 
jects’ reactions to informative feedback were 
manipulated by instructions to (a) study 
the correct alternative or (b) study both the 
correct and the incorrect alternatives. The 
third form of feedback was the same as in 
Sturges (1970). The entire item was pre- 
sented, the correct alternative was not indi- 
cated, but a cue was included which the sub- 
ject could use to find the correct alternative. 
It was expected that the cue should promote 
studying the relationships among the units 
of the item and thus better organization of 
the material. 

A second variable investigated was the 
effect of the initial presentation of each 
item preceding informative feedback. Thus, 
in one condition, subjects were presented 
the material, made a response, and received 
informative feedback. In the other condi- 
tion, subjects received informative feedback 
but with no initial presentation of the ma- 
terial or initial response to the items. 
Learning material, initial presentation for 
the groups receiving this condition, and the 
tests were the same as used by Sturges 
(1970). Also, as in the previous studies, 
three immediate test conditions and two 7- 
day retention tests were compared. 


METHOD 


Design 

Three variables were combined factorially: three 
forms of feedback (Instructions Right, Instructions 
Right Wrong, and Right Wrong Cue); three im- 
mediate test conditions (nothing, recall, and recog- 
nition); and the presence or absence of the initial 
presentation and response preceding informative 
feedback. All subjects had both 7-day recall and 
recognition tests. The subjects were 180 under- 
graduates, fulfilling a course requirement, who 
were randomly assigned with 10 subjects in each 
of the 18 groups. 
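
The group count follows directly from crossing the three between-subject factors. A minimal sketch of that arithmetic is given here; the variable names are illustrative assumptions and are not taken from the original report.

```python
from itertools import product

# Factor levels as described in the Design section.
FEEDBACK = ["Instructions Right", "Instructions Right Wrong", "Right Wrong Cue"]
IMMEDIATE_TEST = ["nothing", "recall", "recognition"]
INITIAL_PRESENTATION = [True, False]
SUBJECTS_PER_GROUP = 10  # stated in the text

# Every combination of the three factors is one independent group.
groups = list(product(FEEDBACK, IMMEDIATE_TEST, INITIAL_PRESENTATION))
total_subjects = len(groups) * SUBJECTS_PER_GROUP

print(len(groups), total_subjects)
```

The 7-day recall and recognition tests are not part of this crossing, since all subjects took both.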


Learning Material 


The learning material was a series of 32 multi- 
ple-choice items with a definition as a stem and 
four uncommon English words as alternatives. For 
the two instruction groups, informative feedback 
presented the stem and all alternatives with the 
correct choice underlined, and subjects were in- 
structed to (a) study the correct alternative care- 
fully (Instructions Right) or (b) study the correct 
and all incorrect alternatives carefully (Instructions 
Right Wrong). For the Right Wrong Cue groups, 
informative feedback presented the entire item. 
The correct alternative was not indicated, but a 
cue was included which could be used to find the 
correct alternative. For example, where the stem 
was TO REPEAT SENSELESSLY and the correct 
alternative was VERBIGERATE the cue was 
VERBUM = WORD AS IN VERB; —ATE = 
TO MAKE. For all forms of informative feedback 
the positions of the alternatives were randomly 
different from that on the initial presentation. All 
material was presented on 35-mm slides and sub- 
jects recorded their answers on special devices de- 
signed so that the answer was turned out of view 
immediately. 


Procedure 


For half of the groups, subjects were presented the material, made a 
response, and received informative feedback for the appropriate 
condition. For these groups each item was presented (15 seconds), the 
subject wrote his choice (15 seconds), and the series of items was 
followed immediately by the series of informative feedback with the 
appropriate experimental instructions. For the remaining half of the 
groups, subjects received informative feedback with the appropriate 
instructions but with no initial presentation of the material or 
initial response to the items. For all groups informative feedback for 
each item was exposed 20 seconds with a 10-second rest between slides. 
The immediate test was given immediately after the series of 
informative feedback and all subjects returned 7 days later for a 
recall test followed by a recognition test. The recognition tests 
presented the stem and all alternatives in randomly different 
positions; the recall tests presented the stem only; and on both tests 
each item was exposed for 15 seconds with 15 seconds to write the 
correct alternative. 


TABLE 1 

MEAN CORRECT, IMMEDIATE TESTS FOR FORM OF 
FEEDBACK AND PRESENCE OR ABSENCE OF 
INITIAL PRESENTATION 

Group                       I-RW      I-R     RW-C 
Immediate recall 
    No IP                   5.30     6.00     1.80 
    IP                     10.90     8.70    10.40 
Immediate recognition 
    No IP                  25.80    27.10    25.50 
    IP                     29.00    28.90    28.30 

Note. Abbreviations: IP = initial presentation; I-RW = instructions 
right wrong; I-R = instructions right; RW-C = right wrong cue. 


RESULTS 
Immediate Tests 


Table 1 presents the mean correct for 
each group for each of the immediate tests; 
and these data were analyzed by analysis 
of variance. Performance was superior for 
subjects who received an initial presentation 
of the material compared with those 
who received informative feedback only 
(F = 28.43, df = 1/108, p < .001). Also, 
performance on the recognition test was 
superior to that on the recall test 
(F = 688.02, df = 1/108, p < .001). 


Seven-Day Tests 


Table 2 presents the mean correct for 
each group for each of the tests and these 
data were analyzed by analysis of variance. 




The effect of form of feedback was divided 
into two orthogonal components: F₁, Right 
Wrong Cue versus Instructions Right + 
Instructions Right Wrong, and F₂, Instructions 
Right versus Instructions Right 
Wrong. The effect of the immediate test 
conditions was divided into two components: 
T₁, the combined tests (Recall + 
Recognition) versus no test, and T₂, 
Immediate Recall versus Immediate Recognition. 
The orthogonal components of the 
interaction between form of informative 
feedback and immediate test conditions were 
determined on the basis of Sturges' (1970) 
results: F₁ × T₁; F₁ × T₂; F₂ × T₂; and 
Instructions Right versus Instructions 
Right Wrong for the combined immediate 
tests (Recall + Recognition). 
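
The components described above can be written as contrast coefficient vectors, and their orthogonality checked numerically. The sketch below is illustrative only: the level ordering, the coefficient signs and scaling, and all names are assumptions, since the report does not print the coefficients it used.

```python
# Assumed level order for feedback: [I-R, I-RW, RW-C]
F1 = [-1, -1, 2]   # Right Wrong Cue versus the two instruction forms
F2 = [1, -1, 0]    # Instructions Right versus Instructions Right Wrong

# Assumed level order for immediate test: [none, recall, recognition]
T1 = [-2, 1, 1]    # combined tests versus no test
T2 = [0, 1, -1]    # immediate recall versus immediate recognition
COMBINED = [0, 1, 1]  # selects only the recall and recognition cells


def dot(a, b):
    """Sum of products of two equal-length coefficient vectors."""
    return sum(x * y for x, y in zip(a, b))


def outer(f, t):
    """Cell-wise coefficients for a feedback-by-test component (flattened 3 x 3)."""
    return [fi * tj for fi in f for tj in t]


components = {
    "F1 x T1": outer(F1, T1),
    "F1 x T2": outer(F1, T2),
    "F2 x T2": outer(F2, T2),
    "F2 within combined tests": outer(F2, COMBINED),
}

names = list(components)
# Two contrasts are orthogonal (with equal group sizes) when their dot product is zero.
all_orthogonal = all(
    dot(components[a], components[b]) == 0
    for i, a in enumerate(names)
    for b in names[i + 1:]
)
print(dot(F1, F2), dot(T1, T2), all_orthogonal)
```

With 10 subjects in every cell, a zero dot product is sufficient for orthogonality; unequal cell sizes would require weighting the products by 1/n.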

Seven-day retention was superior for 
those conditions in which delayed feedback 
was superior to immediate feedback in pre- 
vious studies (Sturges, 1970). Retention 
was superior for informative feedback con- 
sisting of a cue when there was no immedi- 
ate test and for informative feedback indi- 
cating the correct alternative when there 
was an immediate test (F = 6.51, df = 
1/162, p < .05). Thus, as in the previous 
studies, when informative feedback pre- 
sented a cue optimal retention occurred 
with no immediate practice, but with other 
forms of informative feedback, optimal re- 
tention required some immediate practice 
following feedback. As on the immediate 
test, overall retention was superior when 
subjects had an initial presentation of the 
material (F = 14.82, df = 1/162, p < .001), 
but there was no interaction between 
initial presentation and other variables. 
Thus, retention was facilitated by two 
presentations of the material but the effect of 
the other variables did not depend upon an 
initial presentation preceding informative 
feedback. 

The effects of the types of tests were the 
same as found by Sturges (1970). Retention 
was significantly better on a recognition 
test than on a recall test (F = 3452.48, 
df = 1/162, p < .001) and it was better 
following an immediate test than with no 
immediate test (F = 16.92, df = 1/162, 
p < .001). Also, retention was better 
following an immediate recognition test than 
an immediate recall test (F = 5.41, df = 
1/162, p < .05) and this effect was greater 
on the 7-day recognition test than on the 
recall test (F = 19.61, df = 1/162, p < 
.001). 


TABLE 2 

MEAN CORRECT, SEVEN-DAY RETENTION TESTS 
FOR IMMEDIATE TEST CONDITIONS, FORM OF 
FEEDBACK, AND PRESENCE OR ABSENCE 
OF INITIAL PRESENTATION 

No immediate test 
    No IP 
    IP 
Immediate recall 
    No IP 
    IP 
Immediate recognition 
    No IP 
    IP 

Note. Abbreviations: IP = initial presentation; I-RW = instructions 
right wrong; I-R = instructions right; RW-C = right wrong cue. 

Of primary interest, one component of 
the interaction between form of feedback, 
immediate test condition, and form of 
retention test was significant (F = 4.23, 
df = 1/162, p < .05). For the combined 
immediate test conditions, retention on the 7-day 
recognition test was significantly better for 
Instructions Right than for Instructions 
Right Wrong; and on the recall test it was 
superior for Instructions Right Wrong. 
Thus, instructing subjects to respond differ- 
ently at the presentation of informative 
feedback did affect their retention perform- 
ance but this effect depended upon the pres- 
ence of an immediate test as well as the 
form of the 7-day retention test. 


Discussion 


The present findings indicate that reten- 
tion depends upon the way subjects respond 
to informative feedback in combination 
with the kind of immediate practice follow- 
ing informative feedback. The differential 
effects of instructions with different forms 
of the retention test are consistent with the 
results of Sturges (1970) in which perform- 
ance on the 7-day recognition test was su- 
perior when informative feedback had pre- 
sented the correct alternative only and on 
the 7-day recall test when informative feed- 
back presented the incorrect in addition to 
the correct alternative. Thus, it appears 
that subjects did respond to informative 
feedback as they were instructed to. 

The finding that the facilitative effect of 
additional information at informative feed- 
back is more marked when retention occurs 
with minimal cues indicates that improved 
retention is not due solely to minimal learn- 
ing or to discrimination among alternatives 
at a recognition level. These findings add 
support to the conclusion that retention of 
the correct alternative is facilitated when 
subjects have had an opportunity to iden- 
tify relationships among the stem, the cor- 
rect, and the incorrect alternatives, that is, 
to organize the units of the item in a way 
similar to that found with free recall of 
individual words (e.g., Mandler, 1967). Also, 
additional information at informative 
feedback is more facilitative when retention 
occurs with minimal cues. 


ENVIRONMENT, SOCIAL CLASS, AND MENTAL ABILITIES 


KEVIN MARJORIBANKS¹ 

University of Oxford, England 


A new measure of the learning environment of the home was developed 
in order to examine the relationship between the environment of 
children and mental ability test performance. One hundred and 
eighty-five 11-year-old boys and their parents formed the sample. The 
environment measure accounted for a large percentage of the variance 
in verbal, number, and total ability scores and a moderate percentage 
of the variance in reasoning ability scores. For spatial ability the 
relationship with the environment was less definite. The environment 
measure accounted for more of the variance in the ability scores than 
did a set of social status indicators and family structure variables. 


Much of the research that has investigated the relationship between 
the environmental background of children and intellectual test 
performance has been limited by the inadequate measures of both the 
environment and student performance that have been used. 

The environment has generally been defined in terms of social status 
characteristics such as the occupation and education of the parents or 
in terms of family structure variables such as family size and the 
crowding ratio of the home. For intellectual ability a global IQ score 
is often the only measure examined. 

In an attempt to overcome the shortcomings of many of the existing 
environmental studies, it was decided to examine the relationship 
between a refined measure of the learning environment of the home and 
a set of mental ability test scores. 


METHOD 
Mental Abilities 


Four mental abilities were examined: verbal, number, spatial, and 
reasoning. These abilities were operationalized by the scores on the 
relevant SRA Primary Mental Abilities subtests (1962, Rev. ed.). The 
test also provides a general measure of intelligence as well as the 
multifactored scores. 


1 Requests for reprints should be sent to Kevin Marjoribanks, 
Department of Educational Studies, University of Oxford, 15 Norham 
Gardens, Oxford OX2 6PY, England. 


Environment 

The total environment which surrounds an in- 
dividual may be defined as being composed of a 
complex network of forces. It was assumed that a 
subset of the total network of forces is related to 
each human characteristic (Bloom, 1964). Thus 
for verbal, number, spatial, and reasoning ability, 
it was proposed that subenvironments or subsets 
of environmental forces could be identified which 
would be related to each of the abilities. 

The union of the four subenvironments, which 
were postulated to be related to the four mental 
abilities, was defined as the learning environment. 
This learning environment may be present in the 
home, school, and community. Of these, the home 
produces the first and perhaps the most insistent 
and subtle influence on the mental ability develop- 
ment of the child. As a result, the home was chosen 
as the focus of the study. 

From a review of relevant theoretical and empirical literature 
(Coleman, 1966; Dave, 1963; Plowden, 1967; Vernon, 1969; Weiss, 1969; 
Wolf, 1964) a set of eight environmental forces was identified. 
Subsets of these forces were postulated to be related to the mental 
abilities. 

The forces were labeled: 

    press for achievement 
    press for activeness 
    press for intellectuality 
    press for independence 
    press for English 
    press for ethlanguage³ 
    mother dominance 
    father dominance. 

Each of the environmental forces was defined in terms of a set of 
environmental characteristics which were assumed to be the behavioral 
manifestations of the environmental forces. 

A list of the environmental forces and their associated 
characteristics is presented in Table 1. 

3 Ethlanguage refers to any language spoken in 
the home other than English. 


TABLE 1 

ENVIRONMENTAL FORCES AND THEIR RELATED ENVIRONMENTAL CHARACTERISTICS 
USED IN THE INTERVIEW SCHEDULE 

Environmental forces            Environmental characteristics 

1. Press for achievement        1a. Parental expectations for the education of the child. 
                                1b. Social press. 
                                1c. Parent's own aspirations. 
                                1d. Preparation and planning for child's education. 
                                1e. Knowledge of child's educational progress. 
                                1f. Valuing educational accomplishments. 
                                1g. Parental interest in school. 
2. Press for activeness         2a. Extent and content of indoor activities. 
                                2b. Extent and content of outdoor activities. 
                                2c. Extent and the purpose of the use of T.V. and other media. 
3. Press for intellectuality    3a. Number of thought provoking activities engaged in by children. 
                                3b. Opportunities made available for thought provoking discussions 
                                    and thinking. 
                                3c. Use of books, periodicals, and other literature. 
4. Press for independence       4a. Freedom and encouragement to explore the environment. 
                                4b. Stress on early independence. 
5. Press for English            5a. Language (English) use and reinforcement. 
                                5b. Opportunities available for language (English) usage. 
6. Press for ethlanguage        6a. Ethlanguage usage and reinforcement. 
                                6b. Opportunities available for ethlanguage usage. 
7. Father dominance             7a. Father's involvement in child's activities. 
                                7b. Father's role in family decision making. 
8. Mother dominance             8a. Mother's involvement in child's activities. 
                                8b. Mother's role in family decision making. 

The environmental characteristics which are listed in Table 1 
provided the framework for the construction of a new environmental 
measure. Initially, two instruments were developed which elicited 
responses from students regarding their home environments. In the 
first questionnaire responses were of a true-false form, while in the 
second questionnaire a number of alternate responses were provided for 
each question. Although moderate relationships were found between the 
learning environment measures and a number of mental ability test 
scores, it was considered that such data included misperceptions and 
an absence of information regarding many of the environmental forces 
of the learning environment. It was also considered that these 
limitations would be greater if the questionnaires were administered 
to very young children. 

Finally, it was decided to develop a semistructured home interview 
schedule which could be used to elicit responses from both mothers and 
fathers. Because of the complexity and subtlety of the environmental 
forces that were to be measured, it was considered desirable not to 
limit the interviewers and respondents to a completely 
fixed-alternative item schedule. It was also considered that a 
completely open-ended item schedule might reduce the reliability of 
the instrument. In the final instrument, a set of alternate responses 
was supplied for each item. In addition, an "other answer" space was 
provided so that the interviewer could record a response that was not 
covered by those supplied. 

A 6-point rating scale was developed in order to score each item in 
the schedule. The score for each of the environmental characteristics 
was obtained by summing the scores on the relevant environmental 
items, and the score for each of the environmental forces was obtained 
by summing the scores on the relevant environmental characteristics. 
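
The two-level summation described above can be sketched as follows. The ratings, the number of items per characteristic, and the assumption that the 6-point scale runs from 1 to 6 are hypothetical and serve only to illustrate the arithmetic; the characteristic labels follow Table 1.

```python
# Hypothetical item ratings for one family (assumed 6-point scale, 1..6):
# force -> characteristic -> list of item ratings.
ratings = {
    "press for independence": {"4a": [5, 4, 6], "4b": [3, 4]},
    "father dominance": {"7a": [2, 3], "7b": [4, 4, 5]},
}


def characteristic_scores(force_items):
    """Score each characteristic as the sum of its item ratings."""
    for items in force_items.values():
        if any(not 1 <= r <= 6 for r in items):
            raise ValueError("ratings must lie on the assumed 1..6 scale")
    return {name: sum(items) for name, items in force_items.items()}


def force_score(force_items):
    """Score a force as the sum of its characteristic scores."""
    return sum(characteristic_scores(force_items).values())


for force, items in ratings.items():
    print(force, characteristic_scores(items), force_score(items))
```

Because both levels are plain sums, a force with more items has a wider possible score range, which should be kept in mind when comparing the standard deviations of the scales.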

As well as measuring the intensity of the present learning 
environment, the schedule attempted to gain a measure of the 
cumulative nature of the learning environment over time. For example, 
as well as asking, how much schooling do you expect him to receive?, 
the schedule also asked, how long have you had these ideas about the 
amount of schooling you expect him to receive? For each such question, 
six possible answers were provided on the schedule for the guidance of 
the interviewer. 

Three preliminary tests of the schedule were made before the final 
questionnaire was adopted. 


Sample 


Approximately 500 11-year-old boys were tested, using first the 
California Test of Mental Maturity, and then the SRA Primary Mental 
Abilities Test (1962, Rev. ed.). The first test-taking situation was 
used to (a) establish examiner-examinee rapport, (b) ensure that all 
the children were able to understand the test instructions, and (c) 
establish as far as possible uniform test-taking situations. The boys 
were assigned to two categories, one classified as middle class and 
the other as low class. The social-class classification was based on 
an equally weighted combination of the occupation of the head of the 
household and a rating of his (or her) education. As far as possible, 
two parallel pools of boys were formed. The purpose of the substitute 
pool was to provide a set of alternate families which could be used in 
the study if families from the first pool did not agree to 
participate. 

The final sample consisted of 90 boys and their 
parents, classified as middle class, and 95 classified 
as low class. Both parents from each family par- 
ticipated in the interviewing sessions. Each inter- 
view lasted for approximately 2 hours. 


Hypotheses 


In the development of the study it was postulated that subsets of 
environmental forces could be identified which would be related to 
each of the mental abilities. 

The following hypothesis was investigated. 


Hypothesis 1: The verbal, number, spatial, 
reasoning, and total ability 
test scores are significantly re- 
lated to scores of environmen- 
tal forces. 


It was also proposed that the use of environmental forces was a means 
of moving beyond the use of gross classificatory variables such as 
social class factors (occupation of father, education of father, 
education of mother) and family structure characteristics (size of 
family, ordinal position in family, crowding ratio of home) as 
measures of the environment. The advantage of using the 
subenvironmental approach was investigated by examining the following 
hypothesis. 


Hypothesis 2: Scores on the environmental 
forces are more highly related 
to measures of verbal, number, 
spatial, and reasoning ability 
than are other environmental 
measures such as social-status 
indicators and family structure 
characteristics. 


RESULTS 


Before examining the hypotheses the reliability 
coefficient of each of the environmental 
scales was estimated by evaluating 
coefficient alpha (Nunnally, 1967). The 
reliability coefficients are reported in Table 2. 


TABLE 2 

RELIABILITY COEFFICIENTS OF THE 
ENVIRONMENTAL SCALES 

Scale                        Reliability    Number      SD 
                             coefficient    of items 
Press for achievement            .94           50      35.18 
Press for intellectuality        .88           18      17.05 
Press for activeness             .80           25      11.29 
Press for independence           ...           16       8.72 
Press for English                .93           20      17.88 
Press for ethlanguage            .90           15      14.40 
Father dominance                 .67           22       9.22 
Mother dominance                 .66           22      10.33 


It was considered that the reliability 
coefficients obtained in Table 2 were of an 
acceptable level. 
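
Coefficient alpha is conventionally computed from the item variances and the variance of the summed scale score, alpha = k/(k - 1) × (1 - Σ item variances / total variance), where k is the number of items. The sketch below applies that standard formula to made-up scores; the data and function name are assumptions, and the study's raw item scores are not available here.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([resp[i] for resp in item_scores]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 4 items on a 6-point scale.
scores = [
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [5, 6, 5, 6],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(coefficient_alpha(scores), 3))
```

Population variances are used for both numerator and denominator; using sample variances throughout gives the same ratio and therefore the same alpha.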

The first analysis of Hypothesis 1 in- 
volved an examination of the zero-order 
correlations between the scores of the men- 
tal ability tests and the environmental force 
scores. These correlations are presented in 
Table 3. 

The results in Table 3 indicated that 
most of the relationships were statistically 
significant. For spatial ability the lack of a 
relationship with press for independence 
and father dominance scores is related to 
the inconsistent results found by Vernon, 
who, from his analysis of cultural groups, 
indicated that “there was only limited sup- 
port for the hypothesis that masculine dom- 
inance in the home and encouragement of 
initiative are associated with perceptual- 
spatial abilities [1969, p. 222]." 

A further examination of the relationship 
between the learning environment of the 
home and each mental ability was made by 
computing the multiple correlation between 
the eight environmental forces and each 
mental ability. In this analysis, the envi- 
ronmental forces formed a predictor set and 
the mental abilities formed the criterion 
vectors. The results of this analysis are pre- 
sented in Table 4. 

TABLE 3 

INTERRELATIONSHIPS BETWEEN THE MENTAL ABILITY TEST SCORES AND 
SCORES OF THE ENVIRONMENTAL FORCES 

                                             Ability 
Environmental factor         Verbal   Number   Spatial   Reasoning   Total 
Press for achievement        .66**    .66**    .28**     .89**       .69** 
Press for activeness         .52**    .41**    .22**     .20**       ... 
Press for intellectuality    .61**    .59**    .20**     .31**       .59** 
Press for independence       .42**    ...      .10       .28**       .98** 
Press for English            .50**    .27**    .18**     .28**       .40** 
Press for ethlanguage        .95**    .24**    .09       ...         .28** 
Father dominance             .16*     .10      .09       ...         ... 
Mother dominance             .21**    .16*     .04       .10         .16 

* p < .05. 
** p < .01. 


The results in Table 4 indicated that when the environmental forces 
were combined into a set of predictors they accounted for a large 
percentage of the variance in verbal and number ability test scores, 
and a moderate percentage of the variance in the reasoning ability 
test scores. For spatial ability, the corrected multiple correlation 
coefficient did not reach statistical significance. These results are 
supportive of Cattell's theory (Cattell, 1963; Cattell & Butcher, 
1968) which proposes that verbal, number, and reasoning abilities 
(crystallized abilities) depend more on environmental factors than 
does spatial ability (fluid ability). 

The investigations of the zero-order and multiple correlations 
between the environment and mental abilities indicated that, in 
general, the environmental scales had (a) moderate to high concurrent 
validity in relation to verbal and number abilities, (b) low to 
moderate concurrent validity for reasoning ability, and (c) low to 
negligible concurrent validity for spatial ability. 


TABLE 4 

MULTIPLE CORRELATIONS OF EACH OF THE 
MENTAL ABILITY SCORES WITH THE EIGHT 
ENVIRONMENT FORCES 

Ability        Multiple R    Corrected R (Rc)a    Percentage of variance 
Verbal         .72**         .71**                50.4** 
Number         .72**         .71**                50.4** 
Spatial        .82*          .26                   6.7 
Reasoning      .49**         .40**                16.0** 
Total          ...           .72**                51.8** 

a Rc refers to the multiple correlation corrected 
to allow for cumulative errors in multiple R, and 
for small sample size. 

* p < .01. 

** p < .001. 
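
The percentage-of-variance figures reported for the multiple correlations are consistent with squaring the corrected multiple correlation and multiplying by 100. The short check below uses only the three rows whose corrected values are legible in the source; the dictionary names are illustrative, and the 0.1 tolerance is an assumption to allow for rounding of the published correlations.

```python
# Corrected multiple correlations and reported percentages of variance
# (rows legible in the source table).
corrected_r = {"Spatial": 0.26, "Reasoning": 0.40, "Total": 0.72}
reported_pct = {"Spatial": 6.7, "Reasoning": 16.0, "Total": 51.8}

# Variance accounted for is the squared correlation, expressed as a percentage.
computed_pct = {ability: 100 * r ** 2 for ability, r in corrected_r.items()}

for ability, pct in computed_pct.items():
    gap = abs(pct - reported_pct[ability])
    print(f"{ability}: computed {pct:.2f}, reported {reported_pct[ability]}, gap {gap:.2f}")

consistent = all(
    abs(computed_pct[a] - reported_pct[a]) <= 0.1 for a in corrected_r
)
```

Small gaps are expected because the published correlations are rounded to two decimals before squaring.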

The relationship that was found between 
the environmental force scores and the total 
ability score replicates, in part, the studies 
conducted by Dave (1963), Wolf (1964), 
and Dyer (1967). Working with the same 
sample of white fifth-grade American chil- 
dren, both Dave and Wolf found that their 
measure of the environment accounted for 
approximately 50% of the variance in 
global intelligence test scores. Dyer, who 
examined fifth-grade West Indian children, 
found that the environment scores accounted 
for 46% of the variance in global 
intelligence test scores. In this present 
study it was found that 52% of the variance 
in a global intelligence test score could be 
attributed to the measured environmental 
forces. 

The relationship between the environment and the global ability test 
scores obscures, however, important relations between the environment 
and intellectual performance. In the present study the relationship 
between the environment and the verbal-educational abilities was found 
to be similar to the relationship between the environment and the 
global intelligence test scores. The relationship between the 
reasoning and spatial ability scores with the environment, however, 
was of a much lower order. These results indicate some of the 
limitations of using only global measures of intellectual ability in 
environmental studies. 

In relation to Hypothesis 2, scores on the environmental forces are 
more highly related to measures of verbal, number, spatial, and 
reasoning ability than are other environmental measures such as social 
status indicators and family structure variables. 

In Table 5 the zero-order interrelationships 
between the gross classificatory 
measures of the environment and each of 
the mental abilities are presented. 

In general, the results that are reported in Table 5 support findings 
from previous research which have examined the relationship between 
global measures of the environment and intelligence test scores. It 
has been found that typical relationships between social-status 
characteristics and ability scores are represented by a correlation 
coefficient of .35 (Ausubel, 1968). In the present study the average 
correlation was .38. For family structure characteristics, moderate 
relationships have been found between family size and intelligence 
test performance. Fraser (1959) found a relation of −.4, Whiteman and 
Deutsch (1968), a relation of −.24, and Nisbet (1953) found relations 
of −.19 on nonverbal tests and −.33 on verbal tests. Similarly for the 
crowding ratio of the home and the ordinal position in the family, 
studies have found moderate relationships with verbal-educational 
ability scores. The present study replicated these results and also 
indicated a lack of a relationship with the spatial and reasoning 
ability scores. 

Overall, the results indicated that the 
social status characteristics had a moderate 
relationship with the mental ability scores, 
and that the family structure characteris- 
tics, while having a moderate relationship 
with the verbal-educational ability scores, 
had a negligible relationship with the spa- 
tial and reasoning ability scores. 

A qualitative inspection of Tables 3 and 
5 indicated that, in general, the environ- 
mental force scores were more highly re- 
lated to the mental ability scores than were 
the gross indicators of the environment. 

In order to compare quantitatively the 
effectiveness of the environmental force 
scores, and the gross indicators as predic- 
tors of mental ability test scores, a set of 
multiple correlation analyses were con- 
ducted. In these analyses the amount of 
variance that could be attributed to the en- 
vironmental forces was computed after ac- 
counting for the variance that could be at- 
tributed to the gross indicators of the envi- 
ronment. The results of these analyses are 
presented in Table 6. 

The results in Table 6 indicated that the 
environmental forces accounted for 25% of 
the variance in verbal ability test scores, 
34% of the variance in number ability 
scores, and 12% of the variance in reason- 
ing ability scores after the variance due to 
the combination of status characteristics 
(occupation of father, education of father, 
education of mother, number of children, 
ordinal position, and crowding ratio) had 
been allowed for. For the spatial ability 


TABLE 5 


INTERRELATIONSHIPS BETWEEN Gross INDICA 


TORS OF THE ENVIRONMENT AND 


MENTAL ABILITY Test SCORES 


Abilities 
Gross indicato 
en Verbal | Number | Spatial | Reasoning | — Total 
Sprint pales cen sco PAREN 
Edueation of father .29** m P den pu 
dueation of mother .8o* po ae i pri 
Occupation óf father -43** 307 En — "08 — 31" 
Number of children in family A Eie quaii Ae ogo N irage 
wding ratio : rt et ; uM 
z ; 
P< 05. 
“p< 0l. 
TABLE 6
RELATIONSHIP BETWEEN MENTAL ABILITIES, ENVIRONMENTAL FORCES AND
GROSS INDICATORS OF THE ENVIRONMENT

                                                                         Computed      Corrected     % of
Criterion           Predictor variables                                  correlation   correlation   variance
Verbal ability      A = 6 status variables + 8 environmental forces      .74***        .71***        51.0***
                    B = 6 status variables                               .53***        .51***        26.0***
                    C = A - B = environment                                                          25.0***
Number ability      A = 6 status variables + 8 environmental forces      .72***        .71***        50.0***
                    B = 6 status variables                               .42***        .40***        16.0***
                    C = A - B = environment                                                          34.0***
Spatial ability     A = 6 status variables + 8 environmental forces      .38**         .36*          13.0*
                    B = 6 status variables                               .31**         .28*           8.0*
                    C = A - B = environment                                                           5.0
Reasoning ability   A = 6 status variables + 8 environmental forces      [illegible]   .42**         18.0**
                    B = 6 status variables                               .29**         .25            6.0
                    C = A - B = environment                                                          12.0**
Total ability       A = 6 status variables + 8 environmental forces      .78***        .75***        56.0***
                    B = 6 status variables                               .56***        .53***        28.0***
                    C = A - B = environment                                                          28.0**


test scores, the corrected multiple correlation
for environment did not reach statistical
significance. Thus the results provided
support for the general acceptance of Hy- 
pothesis 2. 
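
The subtraction behind the "C = A - B" rows of Table 6 can be sketched in a few lines. This is an illustration only, not the author's analysis: the function names are hypothetical, and the percentages are copied from the rows of Table 6 that are clearly legible (number, spatial, reasoning, and total ability). The tabled percentages also appear to be close to the squared corrected correlations, which the second helper illustrates.

```python
def environment_increment(pct_full, pct_status):
    """Variance attributed to the environmental forces: the full model
    (status variables + environmental forces) minus the status variables alone."""
    return pct_full - pct_status


def pct_from_correlation(r):
    """Percentage of variance implied by a multiple correlation r."""
    return 100.0 * r * r


# (A, B) percentages of variance as read from Table 6 (legible rows only)
table6 = {
    "number": (50.0, 16.0),
    "spatial": (13.0, 8.0),
    "reasoning": (18.0, 6.0),
    "total": (56.0, 28.0),
}

increments = {
    ability: environment_increment(a, b) for ability, (a, b) in table6.items()
}

for ability, c in increments.items():
    print(f"{ability}: {c:.1f}% of variance attributed to environment")
```

Because the increment is a simple difference of two percentages, it inherits any overlap between status variables and environmental forces; the order of entry (status first) is what assigns shared variance to the status variables.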


Discussion 


The results that were obtained from an 
examination of the relationships among the 
constructs of environmental forces, global 
environmental measures, and mental abili- 
ties provided support for the use of the sub- 
environment approach in the study of intel- 
lectual performance. 

The ex post facto design of the study restricted, of course,
the inferences that could be made about causation among the
variables. Also the examination of the relationship between
the environment and the mental abilities did not extricate
genetic influences from “pure” environmental influences.
The disentanglement of these two influences would have
required the study of children in which genotypes and
environments were uncorrelated. Thus, some of the variance
in the mental ability test scores that was attributed to the
environment may, in fact, have a genetic base. However, the
instrument which was developed for the study could be used
in research that might attempt to extricate genetic from pure
environmental forces. The study suggests that only through
more rigorous research in this area of investigation and more
rigorous attention to alternative explanations can we begin
to understand the complexity of environment-organism
interactions.
EFFECTS OF PERCEPTUAL-MOTOR TRAINING AND
MANUSCRIPT WRITING ON READING READINESS
SKILLS IN KINDERGARTEN¹


WALTER B. PRYZWANSKY²


Teachers College, Columbia University 


Three training programs emphasizing fine-motor skill development
were conducted for 15 minute/day periods for 12 weeks. Studying
experimental and control classes (n = 559) in six schools, the posttest
performance on the Gates-MacGinitie Readiness Skills Test and two
measures of visual discrimination were analyzed according to an
analysis of covariance design. Fine-motor activities, regardless of content,
produced no transfer effect in terms of the criterion measures. However,
the fine-motor program which had letters of the alphabet as
its content significantly improved posttraining readiness scores
(p < .001), while no differences were noted in the tests of visual
discrimination skill of the experimental group. The discussion of results
included a consideration of the practical implications for curriculum
development.


Sequenced experiences relating to the in- 
tegration of perceptual-motor functioning 
are believed to contribute to children’s be- 
ginning school success; for example, the 
complex form discrimination task encoun- 
tered in reading may be one of the skills 
that is affected as a result of the visual 
discrimination tasks required in readiness 
exercises. Some concern, however, has been 
voiced in the literature regarding the value 
of perceptual-motor training which is based 
on the transfer effect to reading skills of 
skills gained from practice with material
consisting not of letters but of objects or
geometric shapes. Such concern has contributed
to the intuitive conclusion that practice
on the academic material to be affected
by training would appear to have a more
significant educational influence.

North Carolina 27514.

A review of studies investigating the ef- 
fect of perceptual-motor training is ham- 
pered by a number of factors which make 
comparison of the reported results difficult. 
Some studies evaluate activities which emphasize
either the development of fine- or
gross-motor skills; others emphasize both
types of exercises. Analysis of the data requires
the recognition of these varying emphases
and of the fact that the research
designs are dissimilar. The research literature
suggests that training on fine-motor exercises
which use geometric or nonletterlike
figures contributes little to the development
of skills initially required in the areas of
reading and writing (Cohen, 1966; Linn,
1968; Rosen, 1966). While it may be argued
that improvement in specific perceptual-motor
skills which are fundamental to the
learning of school skills is the main purpose
of the training, only when factors such as
the concept of form and learning set are
involved in the training is any transfer effect
demonstrated (Bosworth, 1967). It
would also appear that not only is “verbal-visual”
discrimination a better predictor of
first-grade reading achievement (Barrett, 
1965) but also that early knowledge of let- 
ters and sounds is causally related to read- 
ing achievement (Chall, 1967). Durrell’s 
(1958) finding that beginning first-grade 
children match letters with ease and the 
work of Gibson, Gibson, Pick, and Osser 
(1962) designed to study the development 
of the ability to discriminate visually also 
lend support to the argument that training 
directed to the significant attributes of the 
forms to be learned, that is alphabet letters, 
holds greater potential transfer value than 
the typical matching tasks found in readi- 
ness material. 

While a number of visual discrimination 
studies have been reported (particularly 
with reference to eliminating left-right re- 
versals), there has been little empirical evi- 
dence evaluating the effectiveness of form 
discrimination versus reproduction training 
on the ability to make visual discrimina- 
tions. Williams (1968) found that kinder- 
garten children receiving discrimination 
training in which the stimuli to be matched 
were transformations (right-left and up- 
down reversals) performed significantly 
better on a series of visual discrimination 
tests than children who either spent a com- 
parable amount of time tracing and copying 
the standards or received simple discrimi- 
nation training. However, Williams does 
suggest from studies in progress that reproduction
training may be more beneficial for
young children.

Considerable attention has been directed
to children’s ability to reproduce geometric
forms, because of the relevance it may bear
to school performance. Correlations between
young children’s functioning on form
copying tasks and reading readiness or
reading achievement test scores range from
.40 to .70 (Beery, 1967). Consequently, such
findings support the notion that perceptual-motor
skill needs to be given serious
consideration in curriculum development.

Traditionally, it would appear that formal
handwriting instruction has received
little attention in the kindergarten program
except to highlight the need to prevent poor 
habits from developing which may interfere 
with later proficiency. Connell (1968) re- 
lated the potential success of teaching kin- 
dergarten children to print, but, as with the 
descriptions of others (Ashton-Warner, 
1963), no statistical data are reported. In 
view of the questionable contribution of 
readiness activities such as matching, repro- 
ducing nonletter forms, or other such fine- 
motor exercises, the comparable effect of 
manuscript type training would appear to 
merit investigation. 

The purpose of this study was to investi- 
gate the effects of various perceptual-motor 
training programs and manuscript training 
on kindergarten children’s test scores in the 
area of reading readiness. Two hypotheses 
were posed: (a) perceptual training pro- 
grams emphasizing fine-motor exercises, in- 
cluding manuscript training, assist kinder- 
garten children in making higher scores on 
tests of readiness skills and word discrimi- 
nation ability compared to pupils who do 
not receive such training; and (b) learning 
to reproduce letters of the alphabet is ex- 
pected to produce greater gains among kin- 
dergarten children’s readiness skills test 
scores, as well as word discriminatory abil- 
ity, than either fine-motor programs or sim- 
ilar type activities included in a regular 
kindergarten curriculum. 

METHOD
Subjects

Six schools in the suburbs of a large metropolitan
city in Pennsylvania participated in the study.
They represented four [...] school by bus.
Table 1 presents the [...] the schools. [...]
schools and [...]
TABLE 1
DISTRIBUTIONS BY AGE, SOCIOECONOMIC STATUS, AND SEX

                          Age              SESᵃ               Sex
School          n       M      SD       M        SD       Boys   Girls
Control 1       72     68.7    5.0     63.1     20.8       41     31
Control 2       81     66.5    3.9     61.6ᵇ    20.2       45     36
Control 3       89     65.2    3.4     80.2     16.0       49     40
Frostig        130     66.1    6.7     80.5     20.2       59     71
Manuscript     104     67.0    3.8     78.1     15.0       49     55
Template        83     67.2    3.8     76.1     19.0       55     28

ᵃ United States Bureau of the Census. Methodology and Scores
of Socioeconomic Status. (Working Paper No. 15) Washington,
D. C.: Bureau of the Census, 1963.
ᵇ Data available for only 45 students.


those of Control Schools 1 and 2. Also, the
Template school was found to have a 2:1 ratio in
favor of boys. Although no explanation could be
determined for this distribution, pretest and posttest
total weighted scores for boys and girls in this
experimental school were not influenced by the
treatment (the pre- and posttest means for boys
were 61.6 and 64.0 while the girls’ pre- and posttest
means were 71.9 and 74.6).


Organization and Curriculum of the
Schools


Only the kindergarten teachers were involved 
in each of the six schools. One Winter Haven 
school teacher terminated employment during the 
course of the study and her replacement received 
the same orientation as the original teachers. 

Half-day schedules were followed in all the 
kindergartens so that both morning and afternoon 
classes were included. The Frostig school employed 
a team teaching approach in the spring, while self-contained
classes were operating in the other
schools.

Classroom observations and inspection of six 
lesson plans of each teacher in the study suggested 
that the curriculum programs in each school were 
comparable. Five of the schools employed readiness 
materials from a basal reading program; Control 
School 2 made limited use of the Continental Press 
material in place of a reading readiness text. 


Design 


Three commercially available perceptual training
programs were selected for study: (a) Template
Training (Sutphin, 1964), (b) The Frostig
Development Book of Visual Perception, Intermediate
Level (Frostig & Horne, 1966), and (c)
Peterson Handwriting System (Peterson, Minister,
& Enstrom, 1961). All three perceptual programs
emphasize the development of fine-motor skills
required in paper-and-pencil tasks. Training is
done at the child’s desk with the exception of
some of the Template training exercises at the
blackboard. While sensory-motor training involving
gross-motor movements was also stated as very
important to the optimal effectiveness of the
Template training and Frostig approaches, these
exercises were not included as part of the training
in this study. Otherwise, instruction followed the
procedures outlined in the teacher's manual. In
addition to the Template training, the developmental
activities for drawing and copying suggested
by Kephart (1960) and those activities outlined by
Bosworth (1967) were incorporated into that program;
these procedures were varied for group
presentation when necessary. At times certain subject
matter exercises were expanded for they contained
the core of what was required, for example,
matching shapes. As recommended in the manual,
approximately eight pages of the Frostig exercises
were completed each week. Finally, the introduction
of the letters in the manuscript program followed
the outline provided by the Peterson Company
and included the verbal instructional cues
used to facilitate a writing rhythm. Teacher demonstration
was done from an easel with a sample
of the letter being taught written on the child's
paper to serve as an additional model. Children
were encouraged to write the letter(s) many times.

The amount and direction of prestudy interest
shown by each of the experimental school staffs in
only one of the perceptual training programs resulted
in a natural determination of which program
they implemented. As a result, the design
of the study should neutralize any existing Hawthorne
effect since interest and commitment in
the respective programs had already been generated
by the teachers and principal. Comparable
test results from the kindergarten, second, and
fourth grades in the experimental schools were
used in identifying control schools. The fine-motor
training was begun in February and lasted 12
weeks; 15 minutes per day were devoted to the
exercises. The teachers administered all the tests.


Procedure 


Reading readiness. The Gates-MacGinitie Readiness
Skills Test was administered (in two parts, as
outlined in the Manual) 1 week prior to the commencement
of the study and during the week that
followed the completion of training. The Gates-MacGinitie
test consists of eight subtests: Listening
Comprehension, Auditory Discrimination, Visual
Discrimination, Following Directions, Letter
Recognition, Visual Motor Coordination, Auditory
Blending, and Word Recognition. Analysis of covariance
was used to evaluate the Gates-MacGinitie
mean readiness posttest results to allow for
adjustments of differences obtained during pretesting.

Word discrimination. In order to evaluate the 
effect of perceptual training on the word dis- 
crimination ability of kindergarten children, two 
tests were administered during the posttest ses- 
sions. The first test (which followed the administra- 
tion of the first half of the Gates-MacGinitie) 
consisted of 30 items made up for the most part 
of the twelve letterlike forms introduced by Gibson 
et al. (1962). The 12 letters had been developed 
according to a set of rules which they hypothesized 
described the general construction of alphabet 
letters. The examiner chose 11 of those letters and 
added 10 other letterlike forms in order to increase 
the number of individual forms children would 
have to deal with, thus paralleling the demands 
of word discrimination tasks using the alphabet. 
Some of these additional forms were transforma- 
tions of the original 12 letters so that reversal and 
rotational factors were present in the task. Each 
of the final 21 letters was then designated as repre- 
senting a letter of the alphabet, and test items were 
developed according to the items which appeared 
in various word discrimination tasks. The final 
copy of the Letter-Like Forms Visual Discrimina- 
tion Test required the pupils to select the one 
simulated word of five presented in each item
which was different. Two sample items were pre- 
sented before starting the test. The range of scores 
was from 1 to 30. 
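
The odd-one-out item format described above can be sketched as follows. Everything here is hypothetical: the form labels (such as "F07" and its reversal "F07r") stand in for the letterlike forms, and the helper names are invented for illustration. The original test materials are not reproduced in the text, so this shows only the structure of an item (five simulated words, one of which differs) and of the scoring.

```python
def make_item(word, position, replacement, slot, n_choices=5):
    """Build one odd-one-out item.

    word        -- sequence of form labels making up the simulated word
    position    -- index of the form to change in the odd word
    replacement -- label of the substituted (e.g., reversed or rotated) form
    slot        -- index among the choices where the odd word is placed
    """
    odd = list(word)
    odd[position] = replacement
    if tuple(odd) == tuple(word):
        raise ValueError("replacement must change the word")
    choices = [tuple(word)] * n_choices
    choices[slot] = tuple(odd)
    return choices


def score(responses, keys):
    """Number of items on which the pupil picked the differing word."""
    return sum(r == k for r, k in zip(responses, keys))


# One hypothetical item: the fourth choice contains a reversed middle form.
item = make_item(("F01", "F07", "F12"), position=1, replacement="F07r", slot=3)
for index, choice in enumerate(item):
    print(index, "-".join(choice))
```

A 30-item test would simply be a list of such items with a parallel list of keyed slots passed to `score`.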

Since the manuscript school would have considerably
more exposure to letters of the alphabet
it might be hypothesized that the children have
an advantage on tests of word discrimination
ability. Consequently, a neutral “word” discrimination
test was needed. The test of Letter-Like
Forms Visual Discrimination represented an instrument
which consisted of forms not previously
encountered by any of the children in the six
schools.

The second test of word discrimination, Subtest
[...], Making Visual Discriminations, of the Harrison-Stroud
Readiness Profile, was modified so that
directions were the same for all 30 items and
[...] cues were eliminated. This subtest required
the pupils to identify the one of four stimulus
[...] which matched the sample provided for
[...]. Two sample items were presented as in
[...]. The scores ranged from 1 to 30. It was
[...] following the administration of the second
half of the Gates-MacGinitie.

Finally, the results of the visual discrimination
subtest, included as part of the Gates-MacGinitie
Readiness Skills Test, were also included as a
measure of word discrimination.
RESULTS


Complete data were available for 94% 
(559/590) of the population participating 
in the prestudy testing sessions. Pretest and 
posttest means for each school along with 
the adjusted posttest means on the Gates- 
MacGinitie test are presented in Table 2. 

The analysis of covariance statistical
procedure was used to determine if any dif- 
ference existed among the Gates-Mac- 
Ginitie adjusted posttest means. A signifi- 
cant F ratio (F = 14.2, df = 5,552, p < 
.001) was obtained. 
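
For readers unfamiliar with adjusted posttest means, the following is a minimal sketch of the usual one-covariate adjustment: each group's posttest mean is corrected by the pooled within-group regression slope times the group's pretest deviation from the grand pretest mean. The pupil-level data of the study are not available, so the numbers below are made up, and the function names are hypothetical. The last helper shows that the reported error degrees of freedom (552) agree with 559 pupils, 6 schools, and 1 covariate.

```python
def mean(values):
    return sum(values) / len(values)


def ancova_adjusted_means(groups):
    """groups: dict mapping group name -> list of (pretest, posttest) pairs.

    Returns (pooled within-group slope, dict of adjusted posttest means).
    """
    grand_pre = mean([pre for pairs in groups.values() for pre, _ in pairs])
    sum_xy = 0.0
    sum_xx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        pre_mean = mean([pre for pre, _ in pairs])
        post_mean = mean([post for _, post in pairs])
        group_means[name] = (pre_mean, post_mean)
        # Cross-products and squares are taken around each group's own means.
        sum_xy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sum_xx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    slope = sum_xy / sum_xx
    adjusted = {
        name: post_mean - slope * (pre_mean - grand_pre)
        for name, (pre_mean, post_mean) in group_means.items()
    }
    return slope, adjusted


def error_df(n_total, n_groups, n_covariates=1):
    """Error degrees of freedom for a one-way analysis of covariance."""
    return n_total - n_groups - n_covariates


# Made-up illustration data, NOT from the study.
toy = {"A": [(1, 2), (3, 6)], "B": [(5, 7), (7, 11)]}
slope, adjusted = ancova_adjusted_means(toy)
print(slope, adjusted, error_df(559, 6))
```

In the toy data, group B has the higher raw posttest mean, but after adjustment for its higher pretest mean group A comes out ahead, which is the kind of reordering the adjustment is meant to expose.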

The Scheffé statistical procedure was 
used to test the first hypothesis. An exami- 
nation of Table 3 indicates that the im- 
provement shown by all pupils exposed to 
all three perceptual-motor training pro- 
grams was not statistically significant at a 
.001 level when compared to the scores 
achieved by children in the control kinder- 
garten programs. 

The Scheffé procedure was also used to 
test the second hypothesis. The improve- 
ment shown by the children in the manu- 
script program, with a mean total weighted 
score of 80.4, was found to be significant 
when compared to the mean score of 73.9 
for children receiving the two other percep- 
tual-motor training exercises (Table 3). 
Similarly, when the mean functioning of the 
manuscript group was compared with the 
combined total weighted score for the three 
control schools, a significant difference (p 
< .001) was again found in favor of the
former (Table 3). 
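
A contrast of the kind tested above can be sketched as follows. The adjusted means and ns come from Table 2; the mean square error is NOT reported in the article, so the value used here is a placeholder assumption, and the function names are hypothetical. The sketch also uses the simple formula for unadjusted means, ignoring the extra covariate term that a full analysis of covariance contrast would carry, so it illustrates the logic rather than reproducing the reported test.

```python
def contrast_value(means, coefs):
    """Weighted sum of group means; coefficients must sum to zero."""
    if abs(sum(coefs)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefs, means))


def contrast_f(means, ns, coefs, mse):
    """F statistic for a contrast. Under Scheffe's method it is judged
    against (k - 1) times the critical F with (k - 1, error df)."""
    psi = contrast_value(means, coefs)
    return psi ** 2 / (mse * sum(c * c / n for c, n in zip(coefs, ns)))


# Order: Control 1, Control 2, Control 3, Frostig, Manuscript, Template
adjusted_means = [72.1, 73.7, 75.4, 72.6, 80.4, 75.3]
ns = [72, 81, 89, 130, 104, 83]

# Manuscript versus the average of Frostig and Template
coefs = [0, 0, 0, -0.5, 1, -0.5]
psi = contrast_value(adjusted_means, coefs)

ASSUMED_MSE = 100.0  # placeholder; the article does not report this value
f_stat = contrast_f(adjusted_means, ns, coefs, ASSUMED_MSE)
print(round(psi, 2), round(f_stat, 2))
```

Equal coefficients of -0.5 give an unweighted average of the two comparison schools; weighting by n would require coefficients proportional to the group sizes instead, and the article does not say which was used.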


TABLE 2
GATES-MACGINITIE READINESS SKILLS PRETEST,
POSTTEST, AND ADJUSTED POSTTEST TOTAL
WEIGHTED SCORES FOR CONTROL AND
EXPERIMENTAL SCHOOLS

School          N     Premean   Postmean   Adjusted postmean
Control 1       72     68.2      74.8        72.1
Control 2       81     61.3      70.8        73.7
Control 3       89     66.2      76.1        75.4
Frostig        130     67.8      74.4        72.6
Manuscript     104     64.1      79.6        80.4
Template        83     62.8      73.5        75.3

Total          559

Note. Highest total weighted score: 96.
TABLE 3
SCHEFFÉ TEST FOR CONTRAST BETWEEN POSTTEST
GATES-MACGINITIE MEANS OF CONTROL AND
EXPERIMENTAL SCHOOLS, MANUSCRIPT
AND THE TWO REMAINING EXPERIMENTAL
SCHOOLS, MANUSCRIPT
AND THE CONTROL SCHOOLS

Schools compared                        Gates-MacGinitie mean
Controls 1, 2, and 3 versus                  73.9 (ns)
Frostig, Template, and Manuscript            76.1
Manuscript versus                            80.4*
Frostig and Template                         73.9*
Manuscript versus                            80.4*
Controls 1, 2, and 3                         73.9*

* p < .001; F = 551, df = 6.


As a result of the significant findings
in favor of manuscript training, the
means of each school on subtests of the 
Gates-MacGinitie were scanned to deter- 
mine if gains could be attributed to im- 
provement on particular subtests. The sub- 
tests’ scores of Auditory Discrimination, 
Following Directions, and Auditory Blend- 
ing indicated that this work was rather easy 
for most of the children in all the schools. 
Differences existed for the other subtests, 
but no apparent pattern evolved. Analysis 
of the data was difficult due to ceiling ef- 
fects for certain schools on certain subtests 
which reduced the gain that could be 
shown; further, the small number of subtest
items coupled with the large n seemed to
magnify the actual obtained results. 

By inspection, no observable difference of 
practical significance on the results of the 
posttraining administration of the modified 
Harrison-Stroud Subtest and the Letter- 
Like Form Test of Word Discrimination 
was evident either among or between 
schools. Intercorrelations between the scores 
of both these tests of word discrimination 
ranged from .48 to .70 in the different samples.
The correlations of each of these two
tests of word discrimination with the
Gates-MacGinitie total weighted scores
show a similar range (.50 to .70).


Discussion 


The present finding, that the inclusion of 
these three perceptual-motor programs into 
the kindergarten curriculum did not significantly
affect readiness skills as measured
by the Gates-MacGinitie test, is somewhat 
compatible with the results of previous 
studies. Support for incorporating percep- 
tual training in prereading school programs 
has been based on the success children 
showed on tests of visual-motor integration, 
which often reflects the exact content found 
in the exercises. When the criterion meas- 
ures no longer consist of geometric shapes 
or nonsense forms, no differences were gen- 
erally reported. In instances where reading 
readiness or achievement scores were im- 
proved, the level of significance was based 
on such small actual score discrepancies 
that practical educational implications were 
not implied. This lack of confirmation of 
the benefits to school tasks, which were hy- 
pothesized to accrue following a period of 
perceptual training, suggests that other facets
of the training may be more important
in the studies which did report significant 
results. One notion is that the behavior of 
attending to important attributes of form 
represents the crucial element in successful 
learning. Equally important, however, could 
be the cognitive demands involved in requiring
the pupil to explain the reasoning
behind his choice in a matching task. Finally,
a conclusion similar to the one raised
concerning the value of matching exercises 
using pictures or objects would apply to 
fine-motor activities, that is, practice in 
drawing geometric shapes or related fine- 
motor activities does not seem to contribute 
to the skill of copying letters or making 
more acute letter/word discriminations. 

The hypothesized advantages of perceptual-motor
activities were not demonstrated
by this study, and the results of this study
coupled with the findings noted in the liter- 
ature, then, would seem to raise serious res- 
ervations regarding the adoption of such 
training into the curriculum. Most of the 
children included in this study apparently 
came to school with adequate perceptual-motor
skills; that training, therefore, was
not appropriate. On the other hand, it may
be that the experiences provided in the reg- 
ular curriculum are sufficient to prepare 
them to handle the tasks which were the 
variables in this study. Many of the exercises
included in perceptual programs have
been developed through use in the remedia- 
tion of learning problems, and it may be in 
this area of instruction that they will be of 
some benefit. In any event, such paper-and- 
pencil exercises should be included in the 
regular program for all children only after 
consideration of the profit which might ac- 
crue from alternate tasks directly relevant 
to reading and writing. 

The second finding, that manuscript 
training produced greater gains than the 
Template training or Frostig book exercises 
on a test of readiness skills, should not be 
surprising in view of the discussion regard- 
ing the relevance of training tasks to the 
criterion measure. Any deliberations of the 
merits of manuscript training for this age 
child should acknowledge the fact that the 
educational significance of such small dif- 
ferences as was found among the Gates- 
MacGinitie total weighted score means is 
questionable. 

Manuscript training could serve to im- 
prove a number of skills; in addition to de- 
veloping good writing habits, letter discrim- 
ination as well as letter names and sounds 
are involved. Partial confirmation of this 
logic was seen in the gains noted in both a 
visual discrimination task (letter recognition)
and a visual-motor coordination task
in which the completion of alphabetic forms
was required. It is possible that perceptual
training which stresses attending to salient
attributes of forms whether they be geometric
or alphabetic may produce similar results;
however, the formal and/or incidental
learnings attached to training in which
the content is letters should be considered.
Also, while visual-motor training involving
paper-and-pencil tasks may be required for
some children, whether it could be done
within the context of printing has yet to be
explored.

The finding that no difference in scores
was observed on three types of word discrimination
tests suggests that this skill
may be highly task related and, therefore,
show little transfer effect. An alternative
explanation would be that skills other than
the perceptual ones which are associated
with visual discrimination need to be emphasized
in order to produce some generalization
effect. Perhaps the complexity of
dealing constantly with labeled letters, in 
sequence, changes the task sufficiently to 
minimize to some degree the importance of 
perceptual factors. It is also possible that 
any one of the experimental groups in this 
study would show a latent advantage in 
reading as a result of their training once 
words become the prime focal point of in- 
terest in the first grade. 
PARADIGMS FOR INDIVIDUALS AND
COOPERATIVE PAIRS 
Selection and reception stimulus presentation paradigms in concept
attainment were compared for individuals and cooperative pairs
in a 2 × 2 × 3 repeated-measures design with the variables: (a)
paradigm (selection or yoked reception), (b) number of persons
(individual or cooperative pair), and (c) problems (three per individual
or pair). Performance on both cards to solution and proportion of
untenable hypotheses was superior for the reception paradigm and for
cooperative pairs. The results of both this study and previous research
are interpreted as supporting the generalization that performance on
a more difficult conceptual task is more effective with the reception
paradigm, while performance on a less difficult conceptual task is
more effective with the selection paradigm.


Concept attainment is a problem-solving
situation that has been widely used in the
study of individual cognitive processes. In
designing an experiment in the area a researcher
constructs a concept universe consisting
of all the instances generated from
all the possible combinations of a number
of attributes (e.g., shape, color, etc.), each
of which has two or more values (e.g.,
triangle or square, red or blue, etc.). The
experimenter arbitrarily designates a rule
and a combination of values fulfilling the
rule as a concept (e.g., triangle and blue for
a conjunctive rule), and the problem solver
is instructed to determine what concept has
been so designated. Two basic stimulus
presentation procedures or paradigms have
been used in this situation, the selection
paradigm and the reception paradigm
(Bourne, 1966). In the selection paradigm
all of the instances in the entire concept
universe are simultaneously present before
the problem solver. The experimenter begins
the problem by indicating an instance that
exemplifies the concept. The problem solver
then selects another instance and is informed
whether or not this selected instance
also exemplifies the concept (i.e., whether it
is a positive or negative instance). Next the
problem solver makes a hypothesis indicating
what he considers the concept to be, and
is informed whether or not his hypothesis is
correct. This cycle is repeated until he
meets some predetermined criterion of solution
such as the correct statement of the
hypothesis. In the reception paradigm the
order of instances is preprogrammed by the
experimenter and presented one at a time to
the problem solver, who must reason to the
correct solution from this series of successively
encountered instances. The two paradigms
are analogous to a student working in
a library from his own choice of references
(selection) versus listening to a classroom
lecture (reception).

* Appreciation is expressed to Joseph P. Reser
for assistance in the collection of the data. Requests
for reprints should be sent to Patrick R.
Laughlin, Department of Psychology, University
of Illinois, Champaign, Illinois 61820.

In general, researchers favoring information-processing
or hypothesis-testing theories
of conceptual behavior have tended to
use selection paradigms, while those favoring
stimulus-response or association theories
have tended to use reception paradigms.
However, most researchers have
seemed to assume that theory and data obtained
with either paradigm apply directly
to the other. Thus, in spite of the theoreti- 
cal importance and the obvious educational 
implications of determining the relative dif- 
ficulty of the two paradigms, only a few 
studies have directly compared the two 
stimulus presentation procedures. In order 
to equate the information presented under 
the two paradigms, all of these studies have 
employed a yoking procedure, in which the 
actual sequence of instances chosen by each 
selection subject is subsequently presented 
to a yoked partner under reception condi- 
tions. These studies have obtained conflict- 
ing results: Huttenlocher (1962) and Mur- 
ray and Gregg (1969) found better per- 
formance with the reception paradigm, 
Hunt (1965) reported better performance 
with the selection paradigm, and Lowenkron
and Johnson (1968) found no difference
between selection and yoked-reception paradigms.
Two more extensive factorial studies
have reported interactions between the
two paradigms and the difficulty of the
problems. Schwartz (1966) found that the
selection paradigm was relatively easier for
conjunctive concepts, while the reception
paradigm was relatively easier for disjunctive
concepts. Laughlin (1969) compared
the two paradigms in a large factorial design
with four variables, and found no difference
in cards to solution but fewer untenable
hypotheses (hypotheses that were
logically inconsistent with the information
available to the problem solver) for the reception
paradigm. There was an interaction
between the two paradigms and the concept
universes employed, indicating that the
selection paradigm was easier on a less difficult
concept universe while the reception
paradigm was easier on a more difficult concept
universe.

In summary, previous research indicates that the
relative difficulty of the two paradigms is
uncertain, but suggests that the reception
paradigm may be progressively easier relative
to the selection paradigm as the difficulty
of the conceptual task increases. As a
test of this suggestion, the following study
[...] more widely used conjunctive rule (e.g.,
Giambra, 1969a, 1969b; Haygood & 
Bourne, 1965; Laughlin, 1968, 1969; Laugh- 
lin, McGlynn, Anderson, & Jacobson, 1968; 
Neisser & Weene, 1962). It was predicted 
that the reception paradigm would result in 
both fewer stimulus instances to solution 
and a lower proportion of untenable hy- 
potheses than the selection paradigm. 
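
The concept universe, the conjunctive rule, and the notion of an untenable hypothesis described in this introduction can be made concrete with a short sketch. The attributes, values, and function names below are hypothetical choices for illustration (the article gives only shape and color as examples), and the yoking helper simply replays one subject's chosen instances in the same order, which is the property the yoking procedure is meant to guarantee.

```python
from itertools import product

# Hypothetical attributes and values; the study's actual universe may differ.
ATTRIBUTES = {
    "shape": ("triangle", "square"),
    "color": ("red", "blue"),
    "size": ("small", "large"),
}


def concept_universe(attributes):
    """All instances generated from every combination of attribute values."""
    names = list(attributes)
    return [dict(zip(names, values)) for values in product(*attributes.values())]


def is_positive(instance, concept):
    """Conjunctive rule: every attribute-value pair of the concept must hold."""
    return all(instance[attr] == value for attr, value in concept.items())


def is_tenable(hypothesis, evidence):
    """A hypothesis is tenable if it classifies every instance seen so far
    the same way the experimenter's feedback did."""
    return all(is_positive(inst, hypothesis) == label for inst, label in evidence)


def yoked_sequence(selection_choices):
    """Instances shown to the yoked reception partner, in the same order."""
    return list(selection_choices)


universe = concept_universe(ATTRIBUTES)
concept = {"shape": "triangle", "color": "blue"}
positives = [inst for inst in universe if is_positive(inst, concept)]

# Feedback after the initial instance and one selected instance.
evidence = [
    ({"shape": "triangle", "color": "blue", "size": "small"}, True),
    ({"shape": "square", "color": "blue", "size": "small"}, False),
]
print(len(universe), len(positives))
print(is_tenable({"color": "blue"}, evidence))      # contradicted by the second instance
print(is_tenable({"shape": "triangle"}, evidence))  # still consistent with the feedback
```

The proportion of untenable hypotheses used as a dependent measure would then be the share of a solver's stated hypotheses for which `is_tenable` is false at the moment they are offered.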

This concept-attainment situation has re- 
cently been extended to the comparison of 
individual versus group problem solving. In 
the first such study Laughlin (1965) com- 
pared individuals and cooperative male 
pairs with a selection paradigm. The coop- 
erative pairs solved the problems in fewer 
card choices, made fewer untenable hy- 
potheses, and used the theoretically more 
effective focusing strategy (Bruner, Good- 
now, & Austin, 1956) more than individu- 
als. In order to determine the relative im- 
portance of discussion and memory in this 
superiority, Laughlin and Doherty (1967) 
used a selection paradigm in a factorial de- 
sign in which discussion was or was not al- 
lowed and paper for recording was or was 
not allowed. Pairs allowed discussion solved 
the problems in fewer card choices than 
pairs not allowed discussion, while paper 
had no effect. A comparable study with a 
reception paradigm (Wolfgang, 1967) also 
found that pairs who were allowed discus- 
sion performed better than both pairs who 
were not allowed discussion and individu- 
als, especially on the more complex prob- 
lems. Laughlin and McGlynn (1967) re- 
ported marked superiority for cooperative 
pairs over two competitive individuals with 
a selection paradigm, while a study by 
Laughlin, McGlynn, Anderson, and Jacob- 
son (1968) extended the superiority of co- 
operative pairs over individuals on a selec- 
tion paradigm to eight conceptual rules and 
two memory conditions. Finally, Laughlin, 
Kalowski, Metzler, Ostap, and Venclovas 
(1968) compared individuals and coopera- 
tive pairs on a reception paradigm under 
three stimulus modalities (visual, auditory, 
and mixed visual and auditory) and two 
information conditions. Pairs were superior 
to individuals in all conditions, especially in 
the most difficult visual stimulus modality.
Thus, a number of studies with either a
selection or a reception paradigm alone
have indicated the relative superiority of
cooperative pairs over individuals in con- 
cept attainment. However, no previous re- 
search has directly compared individuals 
and cooperative pairs on both selection and 
reception paradigms in the same experi- 
ment. Consequently, the second purpose of 
the experiment was to extend the compari- 
son of selection and reception paradigms to 
individuals and cooperative pairs. It was 
predicted that the cooperative pairs would 
require fewer cards to solution and make a 
lower proportion of untenable hypotheses 
than individuals. In addition, the paradigm 
by number of persons interaction would in- 
dicate whether the relative superiority of 
cooperative pairs over individuals would 
have to be qualified by the particular stim- 
ulus presentation paradigm. 


METHOD 


Design and Subjects 


The experimental design was a 2 X 2 X 3 re- 
peated-measures factorial with the variables: (a) 
stimulus presentation paradigm (selection or yoked 
reception), (b) number of persons (individual or 
cooperative pair), and (c) problems (three per in- 
dividual or pair). The subjects were 90 introduc- 
tory psychology students fulfilling required experi- 
mental participation. Fifteen individuals were 
randomly assigned to each of the two individual 
conditions, and 15 same-sex pairs to each of the 
two pair conditions.


Stimulus Materials and Problems 


The concept instances consisted of 64 7.5 ×
12.7 centimeter cards. These 64 cards represented
all the possible combinations of six plus and/or 
minus signs in a row. Each of the six positions was 
a different color, so that plus or minus was the 
value of each color or attribute. The problems 
were conditional concepts with two relevant attri- 
butes, for example, “if red plus, then blue minus.” 
Three sets of three problems each were randomly 
selected from the total set of possible two-value 
concepts, and the initial cards were randomly se- 
lected from the subset of possible positive initial 
cards for each problem. Within each of the four 
experimental conditions, five individuals or pairs 
received each of the three sets of three problems. 
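
The stimulus universe described above can be sketched in a few lines. This is only an illustration: the source names red and blue, so the other four colour names are hypothetical, as are the function names. It enumerates all cards and picks out the cards that could serve as a positive initial card for the example concept "if red plus, then blue minus."

```python
from itertools import product

# Hypothetical colour names: the source mentions only red and blue.
COLORS = ("red", "blue", "green", "yellow", "black", "brown")


def build_universe(colors=COLORS):
    """Return every card as a dict mapping colour -> '+' or '-'."""
    return [dict(zip(colors, signs))
            for signs in product("+-", repeat=len(colors))]


def is_positive_initial(card, if_value, then_value):
    """A positive initial card shows both relevant values of the concept."""
    (if_color, if_sign), (then_color, then_sign) = if_value, then_value
    return card[if_color] == if_sign and card[then_color] == then_sign


cards = build_universe()
concept = (("red", "+"), ("blue", "-"))  # "if red plus, then blue minus"
initial_candidates = [c for c in cards if is_positive_initial(c, *concept)]
print(len(cards), len(initial_candidates))
```

With six two-valued attributes there are 2^6 = 64 cards, and fixing the two relevant values leaves 2^4 = 16 candidate initial cards.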


Procedure and Instructions 


In both selection and reception conditions the 
instructions thoroughly explained the nature of the
task, indicating the attributes and values on the
stimulus cards, the nature of the conditional con-
cept rule, and the nature of the feedback to be em-
ployed on the stimulus cards. In the selection
conditions the subject(s) sat facing an 8 × 8 array
of the 64 cards on a table. After the attributes and
values and their systematic arrangement (e.g., all
blue plus cards were in the top four rows) were
pointed out, the nature of a two-attribute condi-
tional concept was explained with an example.
The problems began by the experimenter indicat-
ing an initial positive card and instructing the
subject(s) that this card contained both of the
relevant values of the concept. The subject(s) then
selected any of the remaining 63 cards, and the
experimenter gave oral feedback to indicate the 
status of the selected card. There were four possi- 
ble types of feedback: (a) present-present—indi- 
cating that the if factor of the concept was present 
and the then factor was present; (b) present- 
absent—indicating that the if factor was present 
and the then factor was absent, (c) absent-present 
—indicating that the if factor was absent and the 
then factor was present, (d) absent-absent—indi- 
cating that the if factor was absent and the then 
factor was absent. These four types of feedback 
were on a reference card which remained available 
to the subject(s) throughout the experiment. After 
this card selection and feedback the subject(s) 
made one (only) hypothesis and the experimenter 
said yes or no to indicate whether or not the hy- 
pothesis was correct. A yes hypothesis solved the 
problem, while a no hypothesis required another 
card selection, feedback, and hypothesis. This 
cycle of card selection, feedback on the card, one 
hypothesis, and feedback on the hypothesis was 
repeated until the subject(s) made the correct 
hypothesis. In the pair conditions the experiment 
was explained as research in cooperative problem 
solving, and it was emphasized that the two indi- 
viduals were not competing in any way but were 
scored as a unit. Full discussion was allowed 
throughout the experiment, and either person 
could make any card choice or hypothesis. After
each selection subject(s) had completed his three
problems his sequence of cards on each problem
was then presented to the next same-sex subject(s)
to appear under the corresponding reception condi-
tion.
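
The four feedback categories used in the procedure can be expressed as a small function. This is a minimal sketch, assuming cards are represented as colour-to-sign mappings; the representation and the names `feedback`, `if_value`, and `then_value` are illustrative and not taken from the original study.

```python
def feedback(card, if_value, then_value):
    """Return the oral feedback category for one card.

    card maps colour -> '+' or '-'; if_value and then_value are
    (colour, sign) pairs describing the conditional concept.
    """
    if_color, if_sign = if_value
    then_color, then_sign = then_value
    if_part = "present" if card[if_color] == if_sign else "absent"
    then_part = "present" if card[then_color] == then_sign else "absent"
    return f"{if_part}-{then_part}"


# Example concept from the text: "if red plus, then blue minus".
concept = (("red", "+"), ("blue", "-"))
for card in ({"red": "+", "blue": "-"}, {"red": "+", "blue": "+"},
             {"red": "-", "blue": "-"}, {"red": "-", "blue": "+"}):
    print(card, feedback(card, *concept))
```

Only the two relevant attributes matter for the feedback, so the example cards omit the other four colours for brevity.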

In reception conditions the problems began
with a given positive initial card, which remained
present throughout the problem. The experimenter
then presented the first card that had been chosen
by the corresponding selection subject(s) and in-
dicated its status (present-present, etc.). The sub-
ject(s) then made a hypothesis and the experimen-
ter said yes or no to indicate whether or not the
hypothesis was correct. If the hypothesis was not
correct the experimenter removed the card that
had just been presented, presented the second card
chosen by the corresponding selection partner(s),
and indicated its status. This cycle was repeated
until the correct hypothesis was given. If the prob-
lem was not solved by the last card chosen by the
yoked selection partner(s), the same sequence of
cards was repeated. In the pair conditions the
same instructions emphasizing cooperative prob-
lem solving as in the corresponding selection con-
dition were given. As in the selection condition, full
discussion was allowed throughout the experiment,
and either individual could make any hypothesis. 


RESULTS 


Number of Card Selections or Receptions to 
Solution 


Means for number of card selections or 
receptions to solution for each of the three 
problems and totals over three problems for 
the four treatment conditions are given in 
Table 1. Results of analysis of variance on 
a square root transformation of the number 
of card selections or receptions to solution 
are given in Table 2. (The square root 
transformation was applied to stabilize the 
variances within the four treatment condi- 
tions.) The reception paradigm required 
fewer cards to solution than the selection 
(F = 5.44, df = 1/56, p < .025). There was 
a trend for cooperative pairs to require 
fewer cards to solution than individuals (F 
= 3.78, df = 1/56, p < .10). The paradigm
by number of persons interaction was not 
significant (F < 1). Neither the main effect 


TABLE 1

MEAN NUMBER OF CARD SELECTIONS OR
RECEPTIONS TO SOLUTION AND
PROPORTION OF UNTENABLE
HYPOTHESES

                  Selection             Reception
Problem       Individual   Pair     Individual   Pair

Card selections or receptions to solution
One              3.73      3.60        2.73      3.40
Two              4.80      3.33        3.60      2.73
Three            4.13      3.47        3.80      3.13
Total           12.66     10.40       10.13      9.26

Proportion of untenable hypotheses
One              .313      .123        .182      .100
Two              .383      .187        .196      .137
Three            .267      .221        .150      .072
Total            .963      .531        .598      .309

TABLE 2
ANALYSES OF VARIANCE FOR SQUARE ROOT TRANS-
FORMATION OF NUMBER OF CARD SELECTIONS
OR RECEPTIONS TO SOLUTION AND ARCSIN
TRANSFORMATION OF PROPORTION OF UN-
TENABLE HYPOTHESES

                                    Cards to solution      Untenable hypotheses
                                    (square root           (arcsin
                                    transformation)        transformation)
Source                        df    MS       F             MS       F
Between subjects
  Presentation paradigm (A)    1    .9553    5.44**        5.1001   9.06***
  Number of persons (B)        1    .6633    3.78*         6.5691   11.67***
  A × B                        1    .0454    <1            .2421    <1
  Error between               56    .1755                  .5630
Within subjects
  Problems (C)                 2    .0985    <1            .2465    <1
  A × C                        2    .0704    <1            .2473    <1
  B × C                        2    .6127    1.96          .2706    <1
  A × B × C                    2    .0235    <1            .1875    <1
  Error within               112    .3123                  .4444

* p < .10.
** p < .025.
*** p < .005.


of successive problems nor any of its inter- 
actions were significant. 
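
The F ratios reported for the between-subjects effects can be checked against the mean squares in Table 2, since each F is the effect mean square divided by the between-subjects error mean square. The sketch below uses the tabled values; the function and variable names are illustrative only.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square divided by the error mean square."""
    return ms_effect / ms_error


# Mean squares from Table 2 (between-subjects effects, error df = 56).
MS_ERROR_CARDS = 0.1755      # cards to solution, square root transformation
MS_ERROR_UNTENABLE = 0.5630  # untenable hypotheses, arcsin transformation

f_paradigm_cards = f_ratio(0.9553, MS_ERROR_CARDS)
f_persons_cards = f_ratio(0.6633, MS_ERROR_CARDS)
f_paradigm_untenable = f_ratio(5.1001, MS_ERROR_UNTENABLE)
f_persons_untenable = f_ratio(6.5691, MS_ERROR_UNTENABLE)

for label, value in [("paradigm, cards", f_paradigm_cards),
                     ("persons, cards", f_persons_cards),
                     ("paradigm, untenable", f_paradigm_untenable),
                     ("persons, untenable", f_persons_untenable)]:
    print(f"{label}: F = {value:.2f}")
```

Rounded to two decimals, these ratios agree with the tabled F values of 5.44, 3.78, 9.06, and 11.67.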


Proportion of Untenable Hypotheses 


Each hypothesis was classified as tenable 
or untenable, that is, logically consistent or 
inconsistent with the available information. 
With the particular six-attribute, two-value 
concept universe and feedback (present- 
present, etc.) of the study, there were eight 
possible types of untenable hypotheses, four 
due to the if factor and four due to the then 
factor of the hypothesis. Untenable hy- 
potheses due to the if factor included hy- 
potheses in which: (a) the opposite value of 
the if factor appeared on a previous pres- 
ent-present instance, (b) the opposite value 
of the if factor appeared on a previous pres- 
ent-absent instance, (c) the if factor ap- 
peared on a previous absent-present in- 
stance, (d) the if factor appeared on a pre- 
vious absent-absent instance. Untenable
hypotheses due to the then factor included
hypotheses in which: (a) the opposite value 
of the then factor appeared on a previous 
present-present instance, (b) the opposite 
value of the then factor appeared on a pre- 
vious absent-present instance, (c) the then 
factor appeared on a previous present-ab- 
sent instance, (d) the then factor appeared 
on a previous absent-absent instance. Thus, 
a hypothesis had to be consistent with both 
the current card and all previously selected 
or received cards to be tenable. The meas- 
ure of untenable hypotheses therefore sub- 
sumed both the perceptual-inference errors 
(responses inconsistent with the current in- 
stance) and memory errors (responses in-
consistent with any previous instance al-
though consistent with the current in-
stance) of Cahill and Hovland (1960). The 
number of untenable hypotheses on each 
problem was divided by the total number of 
card selections or receptions on that prob- 
lem to obtain the proportion of untenable 
hypotheses. Means for the proportion of un- 
tenable hypotheses are given in Table 1. 
Since one hypothesis was required for each 
card selection or reception, the measure of 
proportion of untenable hypotheses in- 
volved binomial variance due to both the 
number of untenable hypotheses (numera- 
tor) and the total number of hypotheses 
(denominator), and therefore an arcsin 
transformation was applied to the propor- 
tion of untenable hypotheses as recom- 
mended by Winer (1962, p. 221). Results of 
analysis of variance on the arcsin transfor-
mation of proportion of untenable hy-
potheses are given in Table 2. There was a 
lower proportion of untenable hypotheses 
with the reception paradigm than the selec- 
tion (F = 9.06, df = 1/56, p < .005). 
Cooperative pairs had a lower proportion of 
untenable hypotheses than individuals (F 
= 11.67, df = 1/56, p < .005). The para- 
digm by number of persons interaction was 
not significant (F < 1). Neither the effect 
of successive problems nor any of its inter- 
actions was significant. 
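
One way to read the eight untenability cases is that a hypothesis is tenable only if it would have produced the same feedback as was actually given on every card seen so far. The sketch below implements that reading; it is an interpretation for illustration, not the scoring procedure documented by the authors, and the three-colour cards and all names are hypothetical.

```python
def predicted_feedback(card, hypothesis):
    """Feedback the card would receive if the hypothesis were the concept."""
    (if_color, if_sign), (then_color, then_sign) = hypothesis
    if_part = "present" if card[if_color] == if_sign else "absent"
    then_part = "present" if card[then_color] == then_sign else "absent"
    return f"{if_part}-{then_part}"


def is_tenable(hypothesis, history):
    """Tenable means consistent with every (card, observed feedback) pair."""
    return all(predicted_feedback(card, hypothesis) == observed
               for card, observed in history)


def proportion_untenable(hypotheses, history):
    """hypotheses[i] is the hypothesis offered after card i was seen."""
    untenable = sum(not is_tenable(h, history[:i + 1])
                    for i, h in enumerate(hypotheses))
    return untenable / len(history)


# True concept assumed for the example: "if red plus, then blue minus".
history = [
    ({"red": "+", "blue": "-", "green": "+"}, "present-present"),
    ({"red": "-", "blue": "-", "green": "+"}, "absent-present"),
]
guess = (("green", "+"), ("blue", "-"))
print(is_tenable(guess, history[:1]), is_tenable(guess, history))
print(proportion_untenable([guess, guess], history))
```

In the example, the guess "if green plus, then blue minus" is tenable after the first card but becomes untenable after the second, because green plus appears on an absent-present instance (case c for the if factor).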


Correlations between Response Measures 


Within each of the three problems for 
each of the four treatment conditions, cor-
relations were computed between the num-
ber of cards to solution (square root trans-
formation) and the proportion of untenable
hypotheses (arcsin transformation). These
12 correlations were converted to z’ values, 
and the mean z' obtained for them. The 
mean z' value was .654, which was recon- 
verted to a mean correlation of .575. This 
moderate correlation indicated that the two 
response measures corresponded to some-
what different aspects of performance, espe-
cially since cooperative pairs differed 
markedly from individuals on the propor- 
tion of untenable hypotheses, but only mar- 
ginally on the number of cards to solution. 
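
The averaging of correlations through z' values can be illustrated as follows. The sketch assumes the standard Fisher transformation (z' = arctanh r, with the mean converted back by tanh); the 12 individual correlations are not reported, so only the reported mean z' of .654 is used.

```python
import math


def mean_correlation(correlations):
    """Average correlations via Fisher's z' and convert the mean back to r."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


# Reconverting the reported mean z' value to a correlation.
mean_z = 0.654
mean_r = math.tanh(mean_z)
print(f"mean r = {mean_r:.3f}")
```

The reconverted value falls within rounding distance of the reported mean correlation of .575.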


DISCUSSION


The major purpose of the experiment was 
to compare concept-attainment perform- 
ance on selection and reception stimulus 
presentation paradigms. Performance was 
clearly superior on the reception paradigm, 
both for the basic overall measure of num- 
ber of cards to solution and for the measure 
of the proportion of untenable hypotheses 
or indication of the degree of logical con- 
sistency with which the problems were 
solved. As indicated in the introduction, 
these results were predicted because the 
conceptual task involved the difficult condi- 
tional concept rule and a concept universe 
with six attributes. The two previous stud- 
ies which have found reception paradigms 
to be easier than selection also probably in- 
volved relatively difficult conceptual tasks. 
Huttenlocher (1962) used an eight-attri- 
bute, two-value concept universe, which was 
probably very difficult for her seventh- 
grade subjects. Although the task of Mur-
ray and Gregg (1969) with college students
involved only four attributes, there were
four values per attribute, and this may
have resulted in a relatively difficult task.
The only study to report a clear difference
in favor of the selection paradigm is that of
Hunt (1962), and his particular concep-
tual task was apparently quite low in difficulty,
since seven of his nine selection subjects
made no errors at all on his classification
response measure. Further, two previous
factorial studies have found interactions in
which performance was better on the recep-
tion paradigm for a more difficult concep-
tual task while it was better on the selec-
tion paradigm for a less difficult conceptual
task (Laughlin, 1969; Schwartz, 1966).
Thus, both the present study and previous 
research suggest the empirical generaliza- 
tion that the reception paradigm is more 
effective for a more difficult conceptual task 
while the selection paradigm is more effec- 
tive for a less difficult conceptual task. 
Because of the wide variation in the par- 
ticular stimulus materials, feedback condi- 
tions, and response measures in this pre- 
vious research, it is perhaps premature to 
speculate on the underlying reasons for this 
proposed empirical generalization of an in- 
teraction between the stimulus presentation 
paradigm and the difficulty of a conceptual 
task. However, the selection paradigm ac-
tually involves two processes or stages: a
decision concerning what stimulus instance 
to select in order to obtain information, and 
an interpretation of the information once it 
is obtained. In contrast, the reception para- 
digm involves only the second process or 
stage, since the order of stimulus instances 
has already been preprogrammed by the ex-
perimenter. On the other hand, the selection 
paradigm enables the problem solver to se- 
lect stimulus instances that are maximally 
informative to him in terms of his own ca- 
pacity to interpret information, while the 
reception paradigm limits the problem solver
to an interpretation of whatever infor-
mation is presented to him. With a concep-
tual task of low difficulty the advantage of
being able to select maximally informative
stimulus instances would thus favor per-
formance on the selection paradigm, while
with a conceptual task of high difficulty the
demands of stimulus selection in addition to
interpretation of the information would re-
sult in poorer performance on the selection
paradigm relative to the reception. This
reasoning should be tested in further re-
search comparing selection and reception
paradigms for conceptual tasks of differen-
tial difficulty, as the proposed generaliza-
tion has important educational implica-
tions. For example, a lecture would be in-
creasingly effective relative to independent
study methods as the difficulty level of the
matter to be learned increased. At a mini-
mum, the apparent assumption of most re- 
searchers in human conceptual behavior 
that theory and data based on one para- 
digm are equally applicable to the other 
paradigm may have to be qualified by the 
difficulty level of the conceptual task. 

The second purpose of the experiment 
was to extend the comparison of selection 
and reception paradigms to a comparison of 
individuals and cooperative pairs. As indi-
cated in the introduction, a number of pre-
vious studies have demonstrated the superi- 
ority of cooperative pairs over individuals 
on both selection and reception paradigms. 
However, no previous research has demon- 
strated the superiority of pairs on both par- 
adigms in the same experiment. In the pres- 
ent study the cooperative pairs had a highly 
significant lower proportion of untenable 
hypotheses, and a trend to fewer cards to 
solution (p < .10) than individuals, thus
extending the results of previous research to 
both paradigms in the same study. Like- 
wise, the nonsignificant interactions be- 
tween the paradigm and number of persons 
for both response measures indicate that the 
superiority of cooperative pairs over indi- 
viduals does not need to be qualified by the 
particular stimulus presentation paradigms
employed in previous research.

ACADEMIC PERFORMANCE WITH AND WITHOUT
KNOWLEDGE OF SCORES ON TESTS OF
INTELLIGENCE, APTITUDE,
AND PERSONALITY:
A FURTHER STUDY


ALFRED J. M. FLOOK AND P. JEANNIE ROBINSON
University of Dundee, Scotland 


A previous study of the relationship between academic performance 
and knowledge of test scores was extended and replicated. The 78 
subjects were volunteer first-year social science students at the Uni- 
versity of Dundee, Scotland. They were divided into two matched 
groups: subjects in Group K were given detailed knowledge of their 
test scores; Group NK received no such knowledge. The effects on 
work level, advantage taken of advice on study methods, satisfaction 
with academic life, subjective probability of success, and anxiety were 
explored. In end-of-year examinations, the two groups as a whole did 
not differ significantly in performance, but Group K’s middle section 
performed better than Group NK's (p < .05). The differing results of 
the two studies indicate that the relationship is heavily dependent on 
the nature of the situation and the subjects. There is a particular need 
for further investigation of the effects on subjects with the lowest abil-
ity scores in their groups.


Requests for reprints should be sent to Alfred
J. M. Flook, Department of Psychology, University
of Dundee, Postal Code DD1 4HN, Dundee, Scot-
land.

In a previous study (Flook & Saggar,
1968) it was found that freshman engineer-
ing students who were given detailed
knowledge of their test scores (Group K)
performed better (p < .001) in end-of-year
examinations than matched students who
were not informed (Group NK). The broad
interpretation offered for Group K’s superi-
ority was that knowledge of test scores had
a catalytic effect on their subsequent be-
havior. More precisely, it was suggested
that such knowledge set up an elaborate
chain reaction in which Group K (a) clari-
fied their relative standing in academic po-
tential by means of improved self-evalua- 
tion through social comparison; (b) thereby 
achieved a more realistic estimate of their 
probability of success; (c) in consequence 
formed task-relevant motivational resolu-
tions (to work harder, improve their
methods of work, etc.); and (d) translated
those intentions into appropriate action 
with marked effects in the end-of-year ex- 
aminations. In short, the interpretation im- 
plied two kinds of consequence to receiving 
knowledge of test scores—a cognitive one, 
involving assessment of the information 
given, and an activating or motivational 
one, leading to facilitating changes in atti- 
tude and behavior. 

Out of this earlier study two sets of ques- 
tions arose which pointed to the need for 
further experimentation. Replication was 
essential to establish what generality can be 
attached to the remarkably clear-cut find- 
ing. Second, the complex social context of
academic performance clearly needed de-
tailed exploration to throw light on what 
goes on that matters in the crucial interval 
between giving, or withholding, test infor- 
mation and eventual examination perform- 
ance. The present study was made, there- 
fore, with the dual purpose of carrying out 
both a replication of the original experi- 
ment with somewhat different subjects and 
also an exploration of the motivational dy- 
namics of the situation. For the sake of 
clarity the relevant hypotheses are de-
scribed, and the results presented and dis-
cussed, under the two main headings of 
Replication and Exploration. 


METHOD


Subjects 


The subjects were 78 volunteers (33 female and 
45 male) from a class of 149 students admitted to 
the Faculty of Social Sciences and Letters in the 
University of Dundee in October 1969. Three of 
the volunteers withdrew from the University dur- 
ing the year and therefore did not sit the degree 
examinations in June 1970, which were used as the 
criterion in this experiment. Consequently they 
and their matched counterparts in the other group 
were all omitted from the analysis of the results.
The results given and discussed in this article 
relate therefore to 72 subjects (32 female and 40 
male) in two matched groups. As in the previous 
experiment, the subjects were divided into two 
groups, and on this occasion they were matched for 
intelligence, anxiety score, and age. The groups
were formed in a way which permitted the subdi- 
vision of each into three comparable sections, 
made up of matched pairs of subjects with scores 
in the top, middle, and bottom thirds of the dis-
tribution of intelligence-test scores.


Materials 


The tests administered were the AH5 Group
Test of High-Grade Intelligence (AH5; Heim, un-
dated), the Achievement Anxiety Test (AAT;
Alpert & Haber, 1960), and the Test Anxiety 
Questionnaire (TAQ; Mandler & Sarason, 1952). 
On this occasion no suitable aptitude test was 
available. A general questionnaire, specially de- 
signed to collect data needed in the Exploration 
phase of the study, was also used. Additional in- 
formation was obtained in personal interviews. 


Procedure 


The general situation regarding the purpose of 
the research and the collection of data was ex- 
plained to each subject at the start of the first 
term in a personal letter from the Head of the 
Department of Psychology. In it he made clear
that the department was sponsoring the study, that
participation was voluntary, and that information
obtained about subjects at all stages of the inquiry
would be treated in strict confidence. At a meeting
of the class a few days later the investigation was 
explained again and questions answered. 

The data were then collected in the following 
stages: 

Group administration of tests and question- 
naire. This was done in three sessions during the 
third week of the first term, except that the TAQ 
was not available in time. When the tests had 
been scored, subjects were divided into two 
matched groups with 39 students in each group.
These then passed through the stages described 
below, except for the three subjects who did not 
sit the examination. 

Individual interviews. During the fifth and sixth 
weeks of the first term all subjects were given an 
individual interview. First, they were all asked to 
fill in the TAQ. Then those subjects in Group K 
were each given their own scores on the other two 
tests: all, without exception, asked for this infor- 
mation when it was offered. They were given indi- 
vidual scores on AH5, together with enough norma- 
tive data to see how they stood in relation to the 
original standardization groups and to their own 
classmates (i.e., the group of 78 volunteers). Again, 
as in the previous study, a special effort was made 
to present this information in an encouraging way 
and they were also given details of the examination 
failure rate (approximately 8%) for the previous 
year. They were then told their AAT scores and 
the relationship between test anxiety and academic 
performance was discussed. General questions 
about the course and personal problems were dealt 
with, and advice on methods of study was offered 
to each subject. Finally they were asked to answer 
again the same questions on how much work they 
were doing and intended to do in the future which
had first been put to them in the questionnaire in 
the group testing session. The other subjects, 
Group NK, went through the same procedure (in- 
cluding the general discussions of the relationship 
of academic performance to intelligence and anxi- 
ety) except that they were not given their indi- 
vidual test scores or the relevant normative data.
They were told they could receive that information
at the end of the session if they wished. Subjects
were assigned to interviews at random, and the re-
sults of Group NK subjects were not looked at
immediately before their interviews so as to avoid
giving them any indication of their results un-
wittingly.

Supplementary interview on methods of study.
Later, an interview was given to any subject in
either group who had asked for such advice in the
first interview.

Second group session. In the fifth week of the
second term, all “surviving” subjects attended the
prearranged final group sessions. They completed
for a second time the general questionnaire, the
TAQ, and the AAT, and answered two new ques-
tions on degree of satisfaction with academic life




First-year degree examinations. In June 1970, all 
subjects except three took five separate degree 
examinations in history, methodology, economics,
psychology, and one of political science, or mathe- 
matics, or demography and biology. The marks 
were averaged to form a criterion score, which is 
used in Table 1 to compare the examination per- 
formances of the two groups. 


REPLICATION 


Hypothesis 


Although Group K's performance was su- 
perior in the previous study, neither general 
considerations nor the findings reported in 
the Exploration section yielded an unequiv- 
ocal prediction one way or the other for the 
present experiment. Therefore the null hy- 
pothesis was tested, as in the earlier study, 
by the two-tailed version of tests appropri- 
ate to the matched-pairs experimental de-
sign.


Results 


The outcome is detailed in Table 1. For 
the whole groups the null hypothesis was 
confirmed. Although the difference in scores 
again favored Group K it fell far short of 
significance. That overall effect was made
up of considerable variation in the sections.
Group K’s top and bottom sections did not
perform so well as Group NK’s, and al-
though neither difference was significant
that for the bottom sections approached the
.05 level. In compensation, the middle sec-
tion scores showed a significant difference in
favor of Group K. These findings are sub- 
stantially different from the previous study 
in which all four differences favored Group 
K and those for the whole group and its 
bottom section reached significance. 
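
The section comparisons can be read against the critical value given in the note to Table 1 (with n = 12 pairs, a two-tailed p of .05 corresponds to a t of 2.201). The sketch below simply applies that decision rule to the tabled t values; the variable names are illustrative.

```python
# Critical value from the note to Table 1: two-tailed .05, n = 12 pairs.
CRITICAL_T = 2.201

# Paired-sample t values reported for the three sections in Table 1.
section_t = {"top": -0.59, "middle": 2.28, "bottom": -2.07}

# A positive t favors Group K; a negative t favors Group NK.
for name, t in section_t.items():
    favored = "Group K" if t > 0 else "Group NK"
    significant = abs(t) >= CRITICAL_T
    print(f"{name}: t = {t:+.2f}, favors {favored}, significant = {significant}")

significant_sections = [n for n, t in section_t.items()
                        if abs(t) >= CRITICAL_T]
```

Only the middle section exceeds the critical value, while the bottom section falls just short of it in the opposite direction.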
Discussion 

In reviewing the differences between the 
findings of the two studies it is clearly im- 
portant to keep in mind how they differed 
in subjects and situation. The first almost 
exclusively concerned male subjects. They 
were studying engineering and almost the 
entire class took part. The second concerned 
subjects of whom nearly as many were fe- 
male as male. They were studying social
sciences and only about one half of the
class took part. The female investigator in 
the first study, being interested in wider 
topics, spent more time with the subjects 
than the one in the second could, and in 
general acted more as a counselor than as 
an experimenter. Most important, perhaps, 
the recent failure rate was about four times 
higher in the first study than in the second, 
a fact which on this occasion may well have 
made concern about failure less acute and 
extensive. 

The virtual equality of the two groups’ 
performance in the present study is in line 
with the findings in the Exploration section, 
where no clear picture in favor of either 


TABLE 1
GROUPS K AND NK (AND THEIR SECTIONS) COMPARED ON CRITERION SCORES

                               Group K          Group NK         Comparisons
Subjects                       M       SD       M       SD       T        t
Whole groups (n = 72)          53.47   4.65     53.19   4.37     329      .26
Sections (in each, n = 12)
  Top                          53.92   3.90     54.92   2.50     33       −.59
  Middle                       55.75   4.10     50.83   5.11     164      2.28*
  Bottom                       50.75   4.46     53.83   3.98     18.54    −2.07

T: Two-tailed Wilcoxon matched-pairs signed-ranks test. With n = 12, a two-tailed p of .05 corresponds to a T of 14.
t: Two-tailed t test for paired samples. With n = 12, a two-tailed p of .05 corresponds to a t of 2.201.
Sections: Of the two groups on AH5 scores, which were 53-38, 38-33, and 33-20, respectively.
* p < .05.

group emerges. Admittedly, the advantage
seems to lie with Group K on a majority of 
variables, but not on what is probably the 
key one—hours of work (see Table 2). 
Here, there is an overall suggestion that 
Group NK were the more hard working in 
absolute terms. However, none of the dif- 
ferences was significant and it is not sur- 
prising therefore that there was not a sig- 
nificant overall difference in performance, if 
one assumes a positive relationship between 
those two variables. There is striking sup- 
port for that assumption at the individual 
level, for of the 12 subjects (7 from Group 
K and 5 from Group NK) who averaged 
less than 50% in examinations, 10 reported 
hours of work below the means for their 
groups. 

Turning to the sections, the nonsignifi- 
cant superiority in performance of Group 
NK’s top third is in line with their higher 
work level which (see Table 3), though not 
reaching significance, was the largest differ- 
ence for the three pairs of sections. The fact 
that two subjects in Group K’s top section 
averaged less than 50%, whereas none in 
Group NK’s did, suggests the possibility of 
a limited complacency effect, but the gen- 
eral body of evidence reported in the Ex- 
ploration section is not in favor of that pos- 
sibility. 

As for the middle sections, the figures in 
Table 1 suggest that Group K’s significant 
superiority may have come as much from 
an abnormally poor performance by Group 
NK as from an enhanced performance by 
Group K. However, the result is in line with 
the detailed Exploration findings, which 
give the advantage to Group K on all the 
variables including hours of work, although 
rather surprisingly the difference on that 
variable was not significant. It is particu- 
larly interesting that the result favors that 
section of Group K which emerges as being 
maximally motivated in terms of subjective 
probability of success—an outcome which 
supports the application to this situation of 
the theory of achievement motivation as 
described below in the Exploration section. 

The near-significant difference in favor of 
Group NK for the bottom sections is the 
most emphatic reversal of results, for in the
first study the superiority of Group K's bottom
section almost reached the .001 level.
Rather surprisingly, again, it is associated
with a large, though nonsignificant, difference
in work level favoring Group K, a fact
that runs counter to any suggestion that 
knowledge of test scores may have had a 
demoralizing effect on this section. The 
other Exploration variables offer no clear 
guide to performance. There is some further 
small support for the “danger-zone” facili- 
tation effect in that three of the four sub- 
jects in the bottom 10% (the danger zone) 
of Group K performed better than their 
counterparts in Group NK. However, the
fourth performed particularly badly, as did 
three others in Group K’s bottom section, 
and it was the specially poor performance 
of these four subjects that accounted for 
Group NK’s superiority. 


EXPLORATION 


The present study’s exploration of the 
motivational significance of knowing one’s 
test scores concentrated on five dependent 
variables, chosen for their theoretical rele- 
vance and their amenability to measure- 
ment. 


Hours of Work 
Hypothesis 


The influence of Festinger’s (1954) social 
comparison theory may be seen in the interpretation
offered in the previous study. In
listing some of the manifestations of social
comparison, Festinger suggested that in a
situation where intelligence is in question a
person may, after comparison, work harder.
Following that suggestion, and bearing in
mind the earlier finding that Group K reported
working harder than Group NK, it
was hypothesized that Group K should report
doing more work than Group NK following
knowledge of results, but that the
effect might require time to manifest itself.


Results 


The weekly averages are shown in Table 
2. The replies (at the third questioning) for 
the three sections of each group are of special
interest, and they are summarized in
Table 3.


Discussion

It is clear from Table 2 that in absolute 
terms there was no increase of work re- 
ported by Group K following knowledge of 
test scores, but such an increase was a 
striking finding in the previous study. How- 
ever, there are indications of a relative 
overall effect beneficial to Group K in that 
the control group (NK) showed a highly 
significant drop in hours over time (almost 
reaching the .001 level for the difference be- 
tween their second and third reports) 
whereas Group K, starting at a lower level,
virtually maintained that level. (The slight 
decrease was not significant.) In relative 
terms, therefore, the hypothesis received 
some support. 

Table 3 shows that at the time of the 
third report (when any effect would have 
had ample time to appear) Group K's sections
reported more work as one moves
from top to bottom, while Group NK's
showed the reverse effect. Looking more 
closely at the top and bottom sections of 
Group K, its top third dropped to 17.0 
hours from a work level of 19.4 hours in 
their first report. At first glance, therefore, 
it seems that knowledge of test scores may 
have produced a slackening of effort due to 
complacency, an ill-effect sometimes predicted
for the disclosure of test scores to
those scoring highly. Against that, further 
analysis of the data showed the difference 
to be nonsignificant and that the top third 
of Group NK also reported a decrease, 
though a smaller one, over the same period. 
There is therefore no reason to conclude 
that knowledge of test scores significantly 
depressed the work level of Group K’s top 
section.

Another common claim is that such 


TABLE 2
Groups K and NK Compared on Hours Spent Weekly on Private Study

Time of replyᵃ        Group K    Group NK    Difference
Before                19.1       20.5        −1.4 (ns)
Immediately after     18.8       21.3        −2.5 (ns)
3 months after        18.2       18.0        .2

ᵃ In relation to the giving or withholding of test scores.


TABLE 3
Hours Spent Weekly on Private Study by Sections of Groups K and NKᵃ

Section    Group K    Group NK    Difference
Top        17.0       22.4        −5.4 (ns)
Middle     17.8       16.2        1.6 (ns)
Bottom     19.9       15.3        4.6 (ns)

ᵃ As reported 3 months after giving or withholding test scores.


knowledge disheartens those with low 
scores, but there is no evidence whatever of 
such an effect on the work level of the bot- 
tom section of Group K. On the contrary, 
that section as a whole showed a marked 
increase (from 17.1 to 19.9 hours) between 
their first and third reports. What is even 
more interesting, the bottom seven subjects 
(i.e., those in, or closest to, the academic 
danger zone as defined by the failure rate) 
showed an increase significant beyond the 
.05 level, whereas their counterparts in 
Group NK showed a decrease significant 
beyond the .01 level. This confirms the ear- 
lier study's finding that knowledge of test 
scores had its maximum facilitating effect 
on subjects whose low scores indicated that 
they might be in greatest danger of exami- 
nation failure. 


Subjective Probability of Success 


Hypothesis 1 

The realism of subjects’ expectations is 
reflected in the degree of correlation be- 
tween their AH5 scores and estimated prob- 
ability of success in examinations. Such a 
relationship is an assumption in Atkinson 
and Feather’s (1966) work on achievement 
motivation in which it has been found 
(O'Connor, Atkinson, & Horner, 1963; 
Smith, 1964) that individual differences in 
intelligence can determine subjects’ subjective
probability of success (Ps). It was hypothesized
therefore that the correlation between
AH5 score and Ps would be higher
for Group K than for Group NK. 


Results and Discussion 


The hypothesis was not supported and
the lack of correlation is not in agreement
with the finding in Atkinson's work that
knowledge of ability scores helps to define Ps.


Hypothesis 2 


A question outstanding from the previous 
study is the relationship between Atkinson's 
theory of achievement motivation and the 
finding that Group K performed better than
Group NK. The subjects in that study were 
viewed as being in a constrained situation 
(facing only one task, with no alternative 
offered) and Atkinson holds that in such a 
situation strength of motivation to perform 
the task should be greatest when Ps is .5. It
was proposed that knowledge of test scores 
could have had a centrifugal effect on 
Group K, leading its members to develop 
widely scattered Ps values. By contrast, ignorance
of test scores could have had a centripetal
effect on Group NK, in that their
greater uncertainty could have made them
cluster closer to a Ps of .5, which would give
rise to the prediction that Group NK would 
perform better than Group K. 
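
The prediction can be made concrete with a small sketch. It assumes the simplest form of Atkinson's function, in which the tendency to approach a task is proportional to Ps(1 − Ps) and so peaks at a Ps of .5; the function names and both sets of Ps values are hypothetical illustrations, not data from either study.

```python
def approach_tendency(ps, motive=1.0):
    """Inverted-U relation between Ps and motivational strength: motive * Ps * (1 - Ps)."""
    if not 0.0 <= ps <= 1.0:
        raise ValueError("ps must lie between 0 and 1")
    return motive * ps * (1.0 - ps)


def mean_tendency(ps_values):
    """Average motivational strength over a group's Ps values."""
    return sum(approach_tendency(p) for p in ps_values) / len(ps_values)


# Hypothetical Ps values: a group clustered near .5 versus a widely scattered group.
clustered = [0.40, 0.45, 0.50, 0.55, 0.60]
scattered = [0.10, 0.30, 0.50, 0.70, 0.90]

print(mean_tendency(clustered), mean_tendency(scattered))
```

Under this assumption the clustered group has the higher mean motivational strength, which is why the constrained-situation reading leads to the expectation that the group clustering closer to a Ps of .5 should perform better.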

This lack of agreement between Atkin- 
son's theory and the previous experimental 
finding can be avoided by viewing the situation
as a selective one rather than a constrained
one. That is, the subjects may be
seen as confronting a choice of tasks in that 
success may be defined in terms of a variety 
of examination targets, varying in number 
and level of passes over and above the pre- 
scribed minimum. In such a situation, 
knowledge of test scores could conceivably 
have made it easier for Group K than for 
Group NK to choose tasks of intermediate 
difficulty (with Ps at or near .5), and
thereby to maximize strength of motivation.
This seems a more appropriate view, and so
it was hypothesized that Group K’s Ps values
would cluster closer to .5 than Group
NK's.


Results and Discussion 


In an attempt to quantify the raw trends
found, and to compare strength of motivation
in the two groups, an index of motivational
strength was constructed. Its basis
was the inverted-U function relating Ps and
motivational strength proposed by Atkinson
(1957). Table 4 shows the results. The difference
between the two groups was nonsignificant
on both occasions, but the second
one nearly reached the .05 level on a one-tailed
t test for paired samples. Taking into
consideration also the significant increase
for Group K, the results are in the expected
direction and lend support to the hypothesis.


TABLE 4
Groups K and NK Compared on Mean Motivational Score

Group    First time    Second time    Difference
K        3.37          4.04           −.67*
NK       3.73          3.64           .09 (ns)

Note. The scores refer to a scale running from 0 to 5.
* p < .05 (one-tailed t test for paired samples).

The results for the sections of the two 
groups are given in Table 5. No section in 
either group underwent a significant change 
of score, but it is noteworthy that Group 
K’s middle section showed the largest in- 
crease. This may be due to the operation of 
a social norm restricting the choice of ex- 
amination targets by subjects, for it was 
found in collecting data that the most pop- 
ular target was to pass all five examinations 
with safety. Such a norm would come clos- 
est to a task of intermediate difficulty for 
the middle and top sections of Group K, 
and those sections should, on Atkinson's 
theory, have maximum motivation after 
knowledge of test scores. The scores support 
this view and provide some explanation of 
the examination superiority of Group K’s
middle section (Table 1). 


TABLE 5
Mean Motivational Scores for Sections of Group K and Group NK

Group    Section    First time    Second time
K        Top        3.52          4.25
         Middle     3.35          4.82
         Bottom     3.35          3.52
NK       Top        3.62          3. 2
         Middle     3.80          3.9
         Bottom     3.78          3.68


Other Dependent Variables


These three seem to be of less importance,
and so discussion of them is summarized.


Advantage Taken of Advice on Methods of 
Study 


The hypothesis that Group K would 
make more use of such advice than Group 
NK was supported, although the difference 
was not so emphatic as it was in the pre- 
vious study. It was restricted to the middle 
sections, which is in line with the perform- 
ance difference reported in Table 1. Further 
evidence of the relevance of this variable to 
performance is seen in the fact that all but 
two of the 21 subjects who obtained advice 
averaged at least 50% in the examinations 
whereas all the other 10 who averaged less 
than 50% were subjects who had not ob- 
tained this advice. 


Degree of Satisfaction with Academic Life 


The hypothesis that Group NK would re- 
port less satisfaction than Group K after 
knowledge of test scores received some sup- 
port. Again the difference in favor of Group 
K was most marked regarding the middle 
sections. 


Anxiety 


The hypothesis, that Group K's scores on 
the AAT facilitating scale would increase, 
obtained some support, and once more most 
increases occurred in the middle section. A
complementary finding was that Group K’s
score on the TAQ decreased very significantly
(p < .001) whereas Group NK’s did
not change. Yet again the effect was most 
marked for the middle sections. 


CONCLUSION 


As already mentioned, the general picture
presented by the Exploration section does
not permit an unequivocal prediction of
[...] superiority for either of the
[...] sections, and so the inconclusive result
comes as no surprise from that point of
view. On the other hand, there are several
hints of a superiority of Group K's middle
section over Group NK's, which the result
bears out. It is clear that the overall conclusion
from this replication must be that
psychometric feedback does not lead consist- 
ently to improved performance: The effect 
is not overwhelmingly in one direction or 
the other, but is heavily dependent on the 
nature of the situation and the subjects. 
There are strong indications that the crucial 
variables include work level, achievement 
motivation, advantage taken of advice on 
methods of study, and the current failure 
rate. However, further research is needed to 
clarify the complex interaction of variables 
in the situation, and it may be useful to 
concentrate in future on the detailed use 
which individual subjects actually make of 
feedback experimentally manipulated so as 
to differ in form and amount. Also, in view 
of the contradictory findings of these two 
studies, special attention should be paid to 
the effects on subjects with the lowest ability
scores in their groups.


EFFECTS OF GRADES AND DISCONFIRMED GRADE
EXPECTANCIES ON STUDENTS' EVALUATIONS
OF THEIR INSTRUCTOR


DAVID S. HOLMES
University of Kansas 


Half of the students who deserved and expected A's or B's were given 
their expected grades, while half were given a grade one step lower 
than expected. After receiving the grades, the students filled out the 
Teaching Assessment Blank. A 2 X 2 analysis of variance revealed that 
the evaluation of the instructor was lowered on only 1 of the 19 items 
as a function of differences in grades, but evaluations on 10 of the 19 
items were lowered as a function of the unexpected lowering of grades. 
It was concluded that although differences in actual grades do not 
affect evaluations, if students' grades disconfirm their expectancies, the 
students will tend to deprecate the instructor's teaching performance
in areas other than his grading system.


A considerable amount of research has 
been done on the potential influences of ex- 
trateaching variables on students’ evalua- 
tions of their college instructors. One varia- 
ble often examined is the grades the stu- 
dents received from the instructor to be 
evaluated. In general, results have been 
somewhat inconsistent; in some cases 
grades were found to be related to the stu- 
dents’ evaluations of their instructors (e.g., 
Anikeeff, 1953; Riley, Ryan, & Lifshitz, 
1950) and in others they were not (e.g.,
Eckert, 1950; Hudelson, 1951; Voeks & 
French, 1960). When relationships between 
grades and evaluations were found, it was 
often assumed that the grades had caused a 
halo, thus introducing error variance that 
would decrease the validity of the evalua- 
tions. There are, however, at least two al- 
ternative explanations. One revolves around 
the fact that the effects of grades usually 
were confounded with the students’ abili- 


*This research was carried out during the au- 
thor's tenure at the Measurement and Evaluation 
Center, University of Texas, Austin. The author 
would like to thank Paul Kelly, Cardine Dowell, 
and Nancy Earl for their help and cooperation. Re- 
quests for reprints should be sent to David S. 
Holmes, Department of Psychology, University of 
Kansas, Lawrence, Kansas 66044.


ties. Perhaps, rather than grades influencing 
evaluation responses, both were a function 
of the students' abilities, thus causing the 
observed relationship between grades and 
evaluation responses. For example, bright 
students who received higher grades than 
their less-bright classmates may have been 
able to appreciate or get more from an in- 
structor and, therefore, evaluated him 
higher than the less-bright students. An- 
other explanation for the reported relation- 
ship between the grades and the evaluations 
revolves around the question of whether the 
grades were consistent with what the stu- 
dents expected. In the event of a discon- 
firmed grade expectancy when, as usually 
happens, a student received a grade lower 
than expected, some degree of tension, dis- 
sonance, or imbalance would result. To re- 
solve this, the student might deprecate his 
instructor’s teaching as a way to account
for his own unexpectedly poor performance.
There is already some experimental evidence
(e.g., Bramel, Bell, & Margulis,
1965) that subjects will change their attitudes
or project characteristics to resolve an
inconsistency and provide an explanation.
This effect may have contributed to the 
previously reported relationship between 
grades and evaluations, since these investi- 
gations were carried out without regard to 
the possible discrepancies between expected 
and received grades. 

The present investigation was carried out 
to test the effects on students’ evaluations 
of their instructor of (a) differences in ac- 
tual grades and (b) disconfirmed grade ex- 
pectancies. The data from this investigation 
are of value because they provide informa- 
tion relevant to theoretical issues associated 
with disconfirmed expectancies and the 
practical problem of the validity of stu- 
dents’ ratings of their instructors, a problem 
which is gaining in importance and atten- 
tion (Eble, 1970; Holmes, 1971; Mc- 
Keachie, 1969). In this experiment, students 
were given final course grades which were 
either what they expected or one step lower 
than what they expected. After receiving 
their grades, they were asked to complete 
the Teaching Assessment Blank (Holmes, 
1971). With these data it was possible to 
determine the effects of differences in actual 
grades and the effects of disconfirmed ex- 
pectancies on students’ evaluations of an 
instructor. 


METHOD 


Grades: Actual and Expected 


Four multiple-choice examinations were ad- 
ministered quarterly during the semester. Actual 
final grades were determined by summing the 
number of correct items over all four examinations 
and constructing a final grade distribution. After 
each of the first three examinations, students 
were told the number of items they had answered
correctly and then given a “suggested grade distribution”
that indicated the approximate letter
grade equivalents. The expected grades were determined
by asking the students to indicate on the
last examination what final grade they expected
in the course. In asking this, it was specifically
pointed out that this would in no way affect their
actual final grade. 


Subjects and Conditions 


Ninety-seven students in a class in introductory
[...] met the requirements to be used as
subjects in this experiment: actual grade of A or B
and expected grade the same as the actual grade.
Of the 39 A students, 20 randomly selected students
received final feedback indicating that they had
received an A, while the remaining 19 students
received final feedback indicating that they had
received a B. Of the 58 B students, 28 randomly
selected subjects received final feedback indicating
that they had received a B, while the remaining 30
received final feedback indicating that they had
received a C.³


Procedure 


The last examination (i.e., the one on which
subjects indicated their expected grades) was
administered during the next to last class. At the
final class, computer printouts were posted, list- 
ing the students' names and the total number of 
items they supposedly had answered correctly over 
the four examinations. The names of the experi- 
mental subjects whose grades were being distorted 
were followed by a “total number correct" figure 
which was lowered enough so that when the distri- 
bution was later announced their grades were one 
letter grade below what was expected. This lower- 
ing could be done since the students did not know 
how well they had done on the last examination. 
The control subjects were given true feedback concerning
the number of items answered correctly.

After all the subjects had an opportunity to 
check their numerical scores, the Teaching Assess- 
ment Blank (Version 3) forms were distributed. The 
students were asked not to fill them out until 
instructed to do so. The instructor then presented 
the distribution of final grades which indicated the 
ranges of numerical scores covered by the letter 
grades. As soon as the students knew their grades, 
they were asked to fill out the Teaching Assessment
Blank forms. When the forms were completed
and collected, the instructor revealed the deception,
informed the students of the purpose of the
experiment, and told them their actual numerical 
and letter grades. With regard to the grade mani- 
pulation, it is important to note that the unex- 
pectedly low grades did not stem from an un- 
expected change in the instructor’s grade 
distribution or curve, but rather from the fact that
the students were led to believe that they had 
done worse on the last examination than expected. 
This is important because with the situation con- 
structed in this way the students thought the un- 
expected grades stemmed from their performance 
rather than from the unexpected behavior of the 
instructor and therefore the students could “really” 
only blame themselves. It should also be noted that 
the procedures were carried out in the normal 
class with all students present. Therefore, the 
students whose responses were being used as data 
for the experiment had no idea that an experiment 
was being conducted or that they had been singled 
out for participation. While many students were 
understandably very concerned about their lower 
grades, none voiced any suspicion concerning a 
deception on the part of the instructor. 


³ Students receiving C's were not used in this
experiment because there was a question of whether
their grades could be credibly manipulated to the
D level. This precaution may limit the generality
of the findings.


RESULTS AND DISCUSSION


A 2 x 2 analysis of variance (actual A
versus actual B and expected grade versus 
lower grade) was carried out on the sub- 
jects’ responses to each of the 19 class-re- 
lated evaluation items. If differences in 
grades or ability were related to differences 
in evaluation responses, then significant 
main effects due to grades would be ex- 
pected. On the other hand, if the previous 
results had been due to disconfirmed expec- 
tancies, then no significant main effects due 
to grades would be expected. The effects of 
disconfirmed expectancies were tested di- 
rectly by the second main effect of each 
analysis which reflected the differences in 
evaluation responses due to receiving the 
expected grades versus receiving unexpect- 
edly lower grades. 
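
The logic of the analysis can be sketched for a single evaluation item. The example below is an illustration under simplifying assumptions, not the analysis actually run: it uses invented ratings, equal cell sizes (the experiment itself had unequal cells), and hypothetical names such as `two_by_two_anova`.

```python
def mean(values):
    return sum(values) / len(values)


def two_by_two_anova(cells):
    """F ratios for a balanced 2 x 2 between-subjects design.

    cells maps (grade, feedback) -> list of ratings; every cell needs the same n.
    """
    grades = sorted({g for g, _ in cells})
    feedbacks = sorted({f for _, f in cells})
    sizes = {len(v) for v in cells.values()}
    if len(grades) != 2 or len(feedbacks) != 2 or len(cells) != 4 or len(sizes) != 1:
        raise ValueError("expected four equal-sized cells in a 2 x 2 layout")
    n = sizes.pop()
    if n < 2:
        raise ValueError("each cell needs at least 2 ratings")

    grand = mean([x for v in cells.values() for x in v])
    # Marginal means for each factor, pooling over the other factor.
    grade_means = [mean([x for f in feedbacks for x in cells[(g, f)]]) for g in grades]
    feedback_means = [mean([x for g in grades for x in cells[(g, f)]]) for f in feedbacks]

    ss_grade = 2 * n * sum((m - grand) ** 2 for m in grade_means)
    ss_feedback = 2 * n * sum((m - grand) ** 2 for m in feedback_means)
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_interaction = ss_cells - ss_grade - ss_feedback
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_within = 4 * (n - 1)
    ms_within = ss_within / df_within
    # Each effect has 1 degree of freedom, so its mean square equals its sum of squares.
    return {
        "grade": ss_grade / ms_within,
        "feedback": ss_feedback / ms_within,
        "interaction": ss_interaction / ms_within,
        "df": (1, df_within),
    }


# Invented ratings on one item; NOT the study's data.
ratings = {
    ("A", "expected"): [5, 6, 7],
    ("A", "lowered"): [3, 4, 5],
    ("B", "expected"): [5, 6, 7],
    ("B", "lowered"): [3, 4, 5],
}
result = two_by_two_anova(ratings)
print(result)
```

In this invented data set only the feedback factor separates the cell means, which is the pattern to be expected if disconfirmed expectancies rather than grades drive the ratings.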


Effects of Grades 


On only 1 of the 19 items did the main 
effect of grades approach significance; on 
Item 7, students who deserved A’s indicated 
that they thought the instructor was better 
prepared for lectures and discussion than 
did students who deserved B’s (F = 3.28, p 
= .07).* Since one relationship at this level 
could have been found by chance, and overall
the probabilities associated with the effects
of grades were high (mean p = .47), it can
be concluded that with this sample, stu- 
dents’ grades did not influence their evalua- 
tions of their instructor. 


Effects of Disconfirmed Expectancies 

In contrast to the lack of findings associated
with differences in grades, 5 of the 19
items were significantly affected (p < .05) 
by the unexpected low grades, and there 
were strong trends (p < .10) in five others. 
Consistent with what would be expected, in 
all of these cases the students whose grades 
had been lowered gave the instructor poorer 
evaluations. Three of the five items on the 
Instructor Presentation subscale were influ- 
enced; that is, relative to those who got the 
grades they expected, the students who were 
given grades lower than expected felt the 


*In this and all subsequent comparisons, the
probabilities are based on df = 1/86.

instructor was not well prepared (Item 7, F
= 6.18, p < .01), he did not present material
in a coherent manner (Item 9, F = 3.68,
p = .06), and his examples did not clarify
the material (Item 8, F = 3.27, p = .07).
The groups did not differ, however, on
Item 10 on this scale which measured 
the degree to which students felt the in- 
structor was aware of whether the class was 
following his presentation or on Item 16 
which asked whether the instructor revealed 
enthusiasm for his teaching. Two of the five 
items on the Evaluation-Interaction sub- 
scale were influenced by the disconfirmed ex- 
pectancy. Students with lowered grades felt 
the instructor did not have sufficient evi- 
dence to evaluate their achievement (Item 
20, F = 5.08, p = .03) and that he did not 
return assignments and tests promptly 
(Item 19, F = 3.07, p = .08). It is interest- 
ing to note that the items measuring the 
degree to which students felt free to ask
questions and disagree (Item 13), the de- 
gree to which the instructor respected stu- 
dents as individuals (Item 23), and how 
fair and impartial he was with students
(Item 14) were not influenced by the unexpected
low grades. Of the seven items on the
Student Stimulation subscale, four were in- 
fluenced by disconfirmed expectancies. 
First, students with lowered grades were 
more likely to indicate that they got less 
than expected from the course (Item 30, F 
= 7.06, p < .01), a response which may 
have been based in part on the instructor's 
apparent evaluation of their achievement in 
the course. Further, however, they indicated 
that they thought the instructor less intel- 
lectually stimulating (Item 15, F = 6.19, p 
= .01), that they had been less stimulated 
to work beyond what the course required
(Item 12, F = 2.87, p = .09), and that their
attention had been held less in class (Item
11, F = 2.87, p = .09) than was the case
with students receiving the higher grades.
On the other hand, the groups did not differ
in terms of the point in the course when
they began to understand the objectives of
the instructor (Item 31), the degree to
which they looked forward to attending
class (Item 25), or the amount of effort
they devoted to the class (Item 29). The
only significant interaction found was associated
with this latter item; subjects who
deserved A's and received B's reported they 
worked harder than those who deserved A's 
and received A's, a response tendency which 
may reflect the frustration of the former 
group. Last, the item on the Test Clarity 
subscale which asked if the examination
questions were clear was influenced (Item
18, F = 4.27, p = .04), though the other
item which asked if the instructor gave ade- 
quate instructions concerning assignments 
(Item 17) was not influenced. 
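
The item counts can be checked directly from the probabilities quoted above. The tally below is a small sketch: the dictionary simply transcribes the reported values (entries given as p < .01 are stored as the bound .01), and the variable names are illustrative.

```python
# Probabilities reported in the text for the disconfirmed-expectancy effect,
# keyed by item number. Items reported as "p < .01" are stored as 0.01.
reported_p = {
    7: 0.01, 9: 0.06, 8: 0.07,      # Instructor Presentation
    20: 0.03, 19: 0.08,             # Evaluation-Interaction
    30: 0.01, 15: 0.01, 12: 0.09, 11: 0.09,  # Student Stimulation
    18: 0.04,                       # Test Clarity
}

TOTAL_ITEMS = 19  # number of class-related evaluation items analyzed

significant = sorted(i for i, p in reported_p.items() if p < 0.05)
trends = sorted(i for i, p in reported_p.items() if 0.05 <= p < 0.10)
affected = len(significant) + len(trends)

print(f"significant: {significant}")
print(f"trends: {trends}")
print(f"{affected} of {TOTAL_ITEMS} items affected")
```

The tally gives five significant items and five trends, ten items in all, which agrees with the count of lowered evaluations given in the abstract.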

These results indicate that differences in 
grades did not influence students’ evalua- 
tions of their instructor when these differ- 
ences were expected. Because differences in 
grades are often thought to be associated 
with differences in ability, it might also be 
concluded that differences in ability did not 
influence students’ evaluations of their in- 
structor. On the other hand, when students’ 
grade expectancies were disconfirmed and 
they received grades lower than expected, 
they deprecated the instructor’s teaching 
performance. While this deprecation could 
have been displaced aggression (it was not 
the instructor’s fault that they apparently 
performed more poorly than expected on 
the last examination and therefore received 
a lower grade), it is the author's opinion 
that the effect was due to the students’ at- 
tempts to justify their unexpectedly low 
grades. The fact that judgments of rather 
objective factors were influenced attests to 
the strength of the effect of disconfirmed 
expectancies. From a practical standpoint, 
it is unfortunate that disconfirmed expectancies
concerning grades do have an effect
on students’ evaluations of the instructors. 
It might be suggested, however, that if stu- 
dents are kept adequately informed of their 
proficiency in a course and if they receive 
grades accurately reflecting their profi- 
ciency, the possibility of disconfirmed ex- 
pectancies will be decreased, and any bias 
due to disconfirmed expectancies will be de- 
creased also. In general, then, it appears 
that with the precaution suggested above, 
students’ ratings may be an effective means 
of assessing students’ opinions of their instructors.


UNDERGRADUATE ACADEMIC EXPERIENCE¹


ARTHUR W. CHICKERING²
American Council on Education, Washington, D.C. 


Subjective and objective environments of 13 small colleges were 
assessed using the College and University Environment Scales (CUES) 
and the Experience of College Questionnaire (ECQ). The ECQ re- 
sults suggest systematic interrelationships within different institutions 
among major elements of the objective environment—mental activi- 
ties in class and studying for courses, role of the teacher, reasons for 
studying, feelings about courses, patterns of work—and suggest that 
varied approaches to curriculum, teaching, and evaluation alter the 
daily academic experiences of students with potential consequences 
for intellectual competence and other aspects of student development. 
The CUES results, though roughly consistent with ECQ findings, dif- 
fered sufficiently to suggest that further study of college environments 
should include both subjective and objective measures, and that edu- 


cational planning should recognize both environments. 


Research on college environments has in- 
creased rapidly during the last 10 years. 
Most studies have examined institutional 
"press" or “climate” as reflected by stu- 
dents' self-reported perceptions of general 
practices and behaviors. The College Char- 
acteristics Index (CCI; Pace & Stern, 
1958) was the first instrument of this sort 
to be widely used. More recently, studies 
employing the College and University En- 
vironment Scales (CUES)—a shorter and 
substantially revised version of the College 
Characteristics Index—have supplemented
the earlier research. Thus, study of college
environments at the level of perceived press 
has been relatively widespread, and a rich 
pool of findings has resulted, as the reviews 
of Stern (1962), Feldman and Newcomb 
(1969), and Chickering (1969) make clear. 

As these data accumulated for diverse 
kinds of colleges and students, a major 


¹ This research was undertaken in the context
of the Project on Student Development in Small
Colleges, supported by United States Public Health
Service Research Grant MH14780-05, National Institute
of Mental Health. Credit is also due the
Office of Research of the American Council on

² Requests for reprints should be sent to Arthur
W. Chickering, who is now at Empire State Col- 
lege, 2 Union Avenue, Saratoga Springs, New York 
12866. 


problem became apparent. A student’s per- 
ceptions, and the institutional scores gener- 
ated by the aggregated perceptions of many 
students, are influenced not only by the ob- 
jectifiable characteristics of the college but 
also by his frame of reference, by his prior 
expectations about the college, and by his 
background. In short, the subjective envi- 
ronment can be very different from the 
objective environment. This observation is 
not new. Psychologists will recall Henry 
Murray’s (1938) discussion of “alpha 
press” and “beta press,” and his warning 
that these two types of environmental 
forces, though usually interrelated, are not 
identical. 

Two other approaches, which usefully 
supplement information based on students’
general impressions, have been developed. 
One, the Environmental Assessment Technique
(Astin & Holland, 1961), defines college
environments in terms of eight characteristics
of the student body: average intelligence,
enrollment size, and the proportions
of the students enrolled in six broad areas
of study called Realistic (e.g., engineering,
agriculture), Scientific (e.g., physics, biology),
Social (e.g., education, nursing), Conventional
(e.g., economics, accounting), Enterprising
(e.g., political science, advertising),
and Artistic (e.g., fine arts, language).

The second, the “stimulus” approach,
first asks the student to describe his own
experiences and behaviors and then takes 
the self-reports of large groups of students 
at a single institution as measures of the 
environment. Using this approach, Astin 
(1968) developed the Inventory of College 
Activities (ICA). Responses were obtained 
from 30,570 students in 246 institutions selected
to represent the population of accredited
four-year colleges and universities.
In addition to the stimulus items, the ICA 
included items asking the student for his 
subjective impressions of the college (e.g., 
It is a friendly campus; The intellectual 
atmosphere is definitely on the theoretical 
rather than the practical side). Factor anal- 
ysis of each of these two kinds of items 
resulted in a set of 27 stimulus factors and 
a set of eight “image” factors. Two of the 
image factors—Concern for the Individual 
Student and Permissiveness—substantially 
overlapped the stimulus factors, but the 
other six overlapped only moderately
(Astin, 1968, p. 112).

It appears, then, that the college environ- 
ment varies depending upon how it is as- 
sessed. Different methods do not necessarily 
generate opposing results, but the results are 
sufficiently different to raise significant 
questions for both further research and edu- 
cational planning. 

The research reported here, on the academic
environments at several small colleges,
assessed both the subjective environment,
using CUES, and the concrete experiences
and behaviors of students and teachers,
using a newly constructed instrument,
the Experience of College Questionnaire
(ECQ; McDowell & Chickering, 1967). The
results document further the need for multilevel
environmental assessment and have
implications not only for future research
but also for educational planning and evaluation.


METHOD 


The Project on Student Development in Small
Colleges, a longitudinal study of institutional
characteristics, student characteristics, attrition,
and student development, was undertaken with
the cooperation of 13 liberal arts colleges, all with
enrollments of 1,500 or less. The cooperating colleges
differ from one another substantially in both
environmental characteristics and student characteristics.
(For more detailed information, see
Chickering, 1969; or Chickering, McDowell, &
Campagna, 1969.)

During the spring semester of 1966, CUES was 
administered at each college to stratified random 
samples of 100 students selected to reflect the pro- 
portions of men and women at each class level. The 
following spring (1967) the ECQ was administered 
to stratified random samples of students also se- 
lected to represent class size and sex distribution. 
The ECQ borrows some items from Astin’s ICA 
and draws also on the Taxonomy of Educational 
Objectives (Bloom, Englehart, Hill, Furst, & 
Krathwohl, 1956) for the cognitive domain and on 
the work of Pervin and Rubin (1966) at Princeton. 
It asks the students to report their concrete experi- 
ences and behaviors in several general areas: aca- 
demics, extracurricular activities, relationships with 
peers, student-faculty relations, and religious experiences
and activities. Usable data were obtained
from 12 of the 13 colleges (ns from 80 to 193).

To obtain ECQ data which would represent the 
total range of academic experiences offered by a 
college without overrepresenting any particular 
type of study, 1 of 5 different hours during the week 
was checked on each questionnaire, thus: 


7:50 a.m. Monday
9:50 a.m. Tuesday
11:50 a.m. Wednesday
1:50 p.m. Thursday
10:50 a.m. Friday.


The check marks were distributed equally among
the different hours, and the questionnaires were
distributed randomly among the students. Each
student was asked to name the two courses or
independent studies which occurred after the time
checked and to respond separately to each. He was
told to disregard any course which was not full
scale, credit giving, and academic (e.g., physical
education, applied music).
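
The allocation of checked hours described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name assign_checked_hours, the student identifiers, and the use of a shuffled "deck" of questionnaires are hypothetical, since the report does not describe the mechanics by which the paper questionnaires were marked and handed out. Only the five hours themselves are taken from the text.

```python
import random

# The five hours are copied from the Method section; everything else is illustrative.
CHECKED_HOURS = [
    "7:50 a.m. Monday",
    "9:50 a.m. Tuesday",
    "11:50 a.m. Wednesday",
    "1:50 p.m. Thursday",
    "10:50 a.m. Friday",
]


def assign_checked_hours(student_ids, hours=CHECKED_HOURS, seed=None):
    """Return {student_id: checked hour}, with hours spread as evenly as possible."""
    rng = random.Random(seed)
    n = len(student_ids)
    # Build a deck of questionnaires in which each hour is checked
    # either floor(n / k) or ceil(n / k) times (k = number of hours).
    deck = [hours[i % len(hours)] for i in range(n)]
    # Shuffling the deck corresponds to handing questionnaires out at random.
    rng.shuffle(deck)
    return dict(zip(student_ids, deck))


if __name__ == "__main__":
    # Hypothetical sample of 100 students (not the study's roster).
    sample = [f"student_{i:03d}" for i in range(100)]
    assignment = assign_checked_hours(sample, seed=1967)
    counts = {hour: 0 for hour in CHECKED_HOURS}
    for hour in assignment.values():
        counts[hour] += 1
    print(counts)
```

Balancing the deck before shuffling keeps the check marks equal across hours, while the shuffle keeps the pairing of hour and student random; drawing an hour independently for each student would satisfy the second property but not the first.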


RESULTS 


Six clusters of findings emerged from the 
analyses: mental activities in class, mental 
activities studying for courses, the role of 
the teacher, reasons for studying, feelings 
about courses, and patterns of work. 

To simplify reporting, this paper focuses 
on four colleges whose data reflect the range 
of student responses. They are Classic (re- 
quired curriculum, comprehensive exams, 
emphasizes intellectual competence); Kil- 
dew (no required courses, independent 
study and self-evaluation, emphasizes per- 
sonal development); Elder (traditional, 
selective, ample resources); and Savior
(traditional, strong church ties, limited re- 
sources). Full information concerning data 
for the other project colleges is available 
upon request to the author.

Both within each cluster and across different
clusters, interrelationships among
items are apparent. Some of these interrela- 
tionships are noted. In the interest of clar- 
ity, straightforward declarative statements 
are used, omitting the usual qualifying 
phrases. Therefore, at this point, it should 
be unequivocally emphasized that these in- 
terrelationships are only hypothetical. Data 
from 12 small colleges, and detailed de- 
scriptions of only 4, are not sufficient to 
support general statements of fact, even 
when some rank-order correlation coefficients
reach statistical significance. Nonetheless,
the findings are strong enough
and clear enough to identify fruitful targets 
for more complex and comprehensive stud- 
ies of relationships among varied aspects of 
student experiences and behaviors as they 
occur in diverse colleges. 

Fig. 1. Mental activities in class. (Panels: Listening and taking notes primarily to remember; Making statements to the class; Thinking about the ideas presented. Vertical axis: percent of students; horizontal axis: percent of time spent. Plotted for Kildew, Classic, Elder, and Savior.)

Figure 1 suggests that mental activities
are systematically interrelated. The question
format did not require that the time
percentages allocated to different alternatives
sum to 100. There is, however, a logical
dependency among the alternatives. It is
difficult to spend substantial amounts of
time listening and taking notes for purposes
of recall while participating actively in
discussion; it is easier to do so while sitting
in a lecture. Therefore, it is not surprising
that when listening and taking notes
occupies a large proportion of class time,
participating in discussion, presenting reports,
and making speeches occupies little.
Presumably, however, thinking about the
ideas presented (analyzing them, thinking
of implications, checking for soundness,
mentally criticizing) can occur during either
a lecture or a discussion. However,
the results indicated that much less time is
given to such thinking where students spend
major amounts of time listening to remember.
A Spearman rank-order correlation
coefficient of -.64 (p < .05) resulted when
the 12 project colleges were ranked on the
percentage of students spending more than
half their time listening and taking notes
and the percentage spending more than half
the time doing their own thinking about the
ideas presented.
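
For readers who wish to see how a rank-order coefficient of this kind is obtained, the following is a minimal sketch of the Spearman calculation (the Pearson correlation of the ranks, with tied values given their mean rank). The function names are hypothetical, and the twelve pairs of percentages in the usage example are invented for illustration; they are not the project data, which are not reported college by college.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical percentages for 12 colleges, NOT the study's data.
    pct_listening = [70, 65, 62, 58, 55, 50, 44, 40, 33, 30, 25, 20]
    pct_thinking = [20, 18, 25, 22, 30, 28, 41, 35, 38, 45, 40, 42]
    print(round(spearman_rho(pct_listening, pct_thinking), 2))
```

A negative coefficient indicates that colleges ranking high on one percentage tend to rank low on the other, which is the pattern the text describes for listening to remember versus reflective thinking.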

At Elder and Savior, for example,
60%-70% of the students spent more than
half their time listening to remember, and
more than 70% spent very little time (5% or
less) participating in discussions. In these
same classes, 35% indicated that they spent
little time (20% or less) thinking reflectively
about the material; indeed, only 20%
spent more than half their time doing so. At
Kildew and Classic, time was more evenly
distributed between listening and taking
notes, on the one hand, and participating in
discussion, on the other, and about twice as
much time was spent thinking about the
ideas. Still, however, at these two colleges
only about 40% of the students spent more
than half their time in reflective thought, so
there is still room for improvement.

Not surprisingly, the mental activities 
used in outside study vary in ways consist- 
ent with the mental activities used in class 
(see Figure 2). At the two colleges where 
the rather passive tasks of listening and 
taking notes predominated, most of the 
time spent in class preparation was devoted 
to memorizing; whereas synthesizing ideas 
or information, applying concepts or princi- 
ples to new problems, and interpreting 
(translating, mentally reorganizing, or 
making inferences) were slighted. At the 
two colleges where listening, talking, and 
thinking were more evenly balanced, less 
time was spent on memorizing, and more on 
the higher level mental activities. 

Fig. 2. Mental activities studying for courses. (Panels: Memorizing; Synthesizing; Applying; Interpreting. Vertical axis: percent of students; horizontal axis: percent of time. Plotted for Kildew, Classic, Elder, and Savior.)

How does the role of the teacher vary
from one college to another? As Figure 3
indicates, at Elder and Savior, where listening
and memorizing predominated, the
teacher most frequently dispensed knowledge
which it is the students’ job to master
or directed his efforts flexibly in order to
help students learn. He did not often work
with students in the mutual pursuit of increased
understanding, nor did he serve
chiefly as a resource while students carried
out their own plans. Consistent with this
pattern, the lectures often followed the text
closely, and open arguments between student
and instructor and between student
and student were relatively infrequent. At
Kildew the teacher seldom simply dispensed
knowledge. Most often, the student and
teacher worked together or the teacher was
a resource while the student carried out his
own plans. Classic had its own pattern in
that instructor roles were more evenly distributed
across the three major categories:
dispensing knowledge, flexibly managing his
own efforts to help students learn, and sharing
learning experiences with students. With
respect to lecturing and open arguing, the
patterns for Classic and Kildew were again
similar and contrasted sharply with Elder
and Savior.

Fig. 3. Role of the teacher. (Items: Lectures follow text; Student sometimes argues openly with instructor; Student sometimes argues openly with other students.)

Motivation for studying can be intrinsic
(growing out of the interests and concerns
of the student) or extrinsic (serving more
external standards and expectations). The
items listed in Table 1 range from intrinsic
to extrinsic. For interest and enjoyment, to
answer questions of concern, and to master
material and do a job well for future vocational
or general use, all, in varying degrees,
come from “inside” the student. To
avoid doing badly, to get a good grade, and
to complete a requirement are more externally
imposed reasons.

At Savior, the desire to get a good grade
or to complete a requirement took precedence
over the rest, and the most intrinsic
reasons were rarely cited. At Elder, the frequencies
were more evenly distributed
across alternatives. At Classic, students
more often said that they studied because
of interest and enjoyment or because of
concern about the issues. But extrinsic reasons
were still very frequently mentioned.
At Kildew, frequencies were highest for the
most intrinsic reasons and extrinsic reasons
played only a small role.

Given these variations in classroom and
study activities, in teacher behavior, and in
motivation, what proportions of students
feel challenged by their courses, confident
about them, interested in them? At Kildew,
students consistently indicated that they
usually felt challenged; confident, effective,
and competent; interested, eager, and attracted.
Classic presents a contrasting picture:
almost two-thirds of the students felt
challenged only rarely or occasionally, and
although 40%-50% felt confident and interested,
such feelings were reported less frequently
than at Kildew. At Elder, half the
students said their courses were challenging
and half said they were not; the same split
occurred for confidence and interest.
Though Savior students more frequently
felt challenged, the proportion who felt confident
and interested was about the same as
among Elder students.

The overall picture presented by Table 2
is discouraging. Not more than two-thirds
of the students at any college often find
their courses challenging, nor feel confident,
effective, and competent in them. Something
is amiss. Curricular patterns, evaluational
procedures, and teaching practices clearly
need attention. And given the diversity of
students and the range of approaches that
characterize the four very different institutions,
it is apparent that none of them has
the answer to being educationally effective
for all, or even most, of its students.


TABLE 1
REASONS FOR STUDYING (IN %)

                                                   College
Reasons for studying                     Savior  Elder  Classic  Kildew
[...] because it interested [...]        [illegible]
[...] understand better.                 [illegible]
To broaden my general knowledge,
  understanding, background.               14     14      12       19
To have a sense of mastering the
  material, of doing a job well.           13     15      15       12
To learn something that will be
  useful vocationally or in other
  activities later on.                     12      8       4       12
To avoid doing badly, getting
  behind (or further behind).              12     16      11        4
To get a good grade.                       19     13      10        1
To finish another requirement
  toward graduation; to get
  academic credit.                         15      9      13        4


TABLE 2
FEELINGS ABOUT COURSES (IN %)

                         Feelings about courses
                 Challenged     Confident,     Interested,
                 to do best     effective,     eager,
College          thinking       competent      attracted
Savior
  Rarely            42             53             41
  Frequently        56             50             53
Elder
  Rarely            50             55             41
  Frequently        50             45             53
Classic
  Rarely            64             46             41
  Frequently        37             52             55
Kildew
  Rarely            33             31             20
  Frequently        67             69             75

Note. Each item offered four response alternatives:
1 = rarely or never; 2 = occasionally; 3 =
frequently; 4 = most or all of the time. (Figures
combine 1 and 2, and 3 and 4.)


TABLE 3
[Title illegible in source]

                                                   College
                                         Savior  Elder  Classic  Kildew
I coast along, seldom make real
  effort.                                   2      2       1        0
I usually coast but work fairly
  hard at times.                           14     19      28        8
I work at a moderate level but
  seldom push myself.                      14     17      17       14
I work at a moderate level and
  sometimes quite hard and long.           51     44      44       49
I work fairly intensely most of
  the time, and very hard and
  long at times.                           16     17       9    [illegible]
I nearly always work about as
  long and hard as I possibly can.       [illegible]


Fig. 4. Average number of hours invested per week. (Panels: Studying for Class; Reading for Pleasure; Watching TV. Vertical axis: percent of students; horizontal axis: hours per week. Plotted for Kildew, Classic, Elder, and Savior.)

Do these differences among institutions
have some effect on how hard students
think they work? Apparently not. As the
similar percentages in Table 3 indicate, for
the first time all four institutions conformed
closely to a single pattern. The only exception,
and a slight one at that, is Classic,
where more students coasted or worked sporadically,
and fewer worked intensely.

The colleges varied somewhat in keeping
up with assignments: Kildew students more
frequently kept up to date, and Savior students
more frequently fell behind. The congruence
between hours spent studying for
class, as shown in Figure 4, and the
patterns of work, shown in Tables 3 and 4,
reinforces the notion that most students at
the four diverse colleges worked at nearly
the same pace. Classic students deviated
most markedly from the others in hours
spent studying, as they did in the other
areas: only 10% spent more than 40 hours
per week studying, and about 16% studied
from 0 to 7 hours per week.


TABLE 4
KEEPING UP TO DATE ON COURSE ASSIGNMENTS (IN %)

                                                   College
Keeping up to date                       Savior  Elder  Classic  Kildew
I have almost always been behind
  on my assignments.                        7     16      17        3
More often than not, I have been
  behind on assignments.                   34     22  [illegible]   8
More often than not, I have kept
  my assignments up to date.               39     42      36       44
I have almost always kept my
  assignments up to date.                  20     24      27       45


Variations also were apparent in time
spent reading for pleasure and watching television.
Kildew students spent substantially
more time reading and less time watching
television than did any of the others. Classic
students spent somewhat more time
reading than watching TV, Savior students
spent more time watching TV, and Elder
students did little of either.

In general, these varied findings pair 
Elder and Savior. Students at these two col- 
leges responded in much the same way to 
almost all the items, and the profiles of the 
two institutions were very similar. The profiles
for Classic and Kildew differed consistently
from those of Elder and Savior, but
they also differed from one another, princi- 
pally in reasons for studying, in feelings 
about their courses, and in their less fre- 
quent use of instructors as resources for 
their own independent learning. 

How did these institutions score on the 
College and University Environment Scales, 
which had been administered to similar 
samples of students the year before? Table 
5 reports the results for four scales most 
pertinent to the academic area: 


Quality of Teaching and Faculty-Student Re- 
lationships, an atmosphere in which professors 
are perceived to be scholarly, to set high stand- 
ards, to be clear, adaptive, and flexible; Scholar- 
ship, an environment characterized by intellectu- 
ality and scholastic discipline; Awareness, 
awareness of self, of society, and of aesthetic 
stimuli; Practicality, both vocational and collegiate
emphases.


On the Quality of Teaching scale, all four
colleges score high and close together, despite
the substantial differences in concrete
behaviors and experiences reflected by the
ECQ findings reported earlier. On the
Scholarship scale, Elder and Savior are
again paired, both scoring above the fiftieth
percentile, while Classic and Kildew join
each other below that mark. On the Awareness
scale Classic and Kildew remain close
together, and although they score high,
Elder scores even higher, leaving Savior
with the lowest score. On the Practicality
scale, Elder, Classic, and Kildew score together
again and deviate even more dramatically
from Savior.


TABLE 5
PERCENTILE EQUIVALENTS OF COLLEGE AND
UNIVERSITY ENVIRONMENT SCALES SCORES

                                    Scale
           Quality of teaching
           and faculty-student
College    relationships         Scholarship   Awareness   Practicality
Classic          80                  43            76            8
Elder            99                  98            91            5
Kildew           72                  27            80            5
Savior           82                  73            65           57

Note. Based on reference group of 100 colleges
and universities; Technical Manual (second
edition, Table 5).


Detailed examination of the varied relationships
among the CUES scores and the
reasons for these variations is not appropriate
to the purposes of this paper. In addition,
it should be recognized that the two
instruments, CUES and ECQ, differ not
only in method, but in content, as do most
of the other instruments currently used to
assess college environments. The important
point, however, is that CUES and ECQ
both address similar areas of concern and
that colleges grouped close together on one
measure are far apart on another. Further,
the differences are sufficiently great that
judgments about the quality of teaching
and student-faculty relationships would
probably vary depending upon which instrument
was used.


Discussion 


[...] impressions, may not even be roommates on
[...] report their specific daily experiences and behaviors,
although they may remain in the
[...] one instrument may find themselves friends
or acquaintances [...]
these discrepancies major questions need
further study.

First, what are the major determinants of
a student’s general impressions of a college?
Where does his “image” of the college come
from? How much do the college catalog, the
campus newspaper, the college’s publicity
releases, and other institutional literature
contribute? How much is his overall perception
dominated by the most visible and
vocal student minorities? In what ways do
these on-campus forces interact with the
student’s precollege expectations and background?
For example, both Elder and Savior,
whose students shared similar academic
experiences, are relatively traditional liberal
arts colleges. Elder is fairly wealthy,
both in its physical plant and in its financial
resources, prestigious, and selective; it
has a high proportion of PhDs on its faculty,
and it pays them well. Savior is less
well known, less affluent, less selective, and
operates with more limited facilities. Is it
these differences in prestige and wealth that
account for Elder’s higher scholarship
scores on CUES, despite the similarity in
academic experiences of Elder and Savior
students? What accounts for the similar
scores on quality of teaching at Classic and
Savior, when teaching styles and student
experiences vary so greatly?

Second, are the student’s subjective and 
generalized impressions systematically re- 
lated to his daily experiences and behav- 
iors? Where do close relationships occur, 
and where do subjective impressions and di- 
rect experience remain more or less com- 
partmentalized and insulated from one an- 
other? What conditions and practices sus- 
tain such insulation? 

Third, what interrelationships are there
among the student’s direct experiences and
behaviors? Do those suggested here for the
academic area occur at other institutions
and for other kinds of students? Are there
relationships across general areas? Do experiences
and behaviors in the academic domain
interact significantly with areas like
student-faculty relationships and peer relationships?
For example, two indices of contact
with faculty, (a) the number of faculty
members whom the student talked with
outside of class and (b) the number of different
out-of-class conversations, were found
to correlate -.41 and -.40, respectively,
with the amount of time spent listening and
taking notes in class; rs for amount of time
spent memorizing material for class preparation
and the two indices of contact with
faculty were -.65 and -.71, respectively, p
< .05. Is this relationship between rather
low-level mental activities in study and relatively
limited student-faculty contact a
general phenomenon? Is there a causal
connection between them?
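
One way to see why the second pair of coefficients carries the significance label is to convert each coefficient to a t statistic. The sketch below is an illustration only, resting on two assumptions that the text does not state: that the coefficients were computed across the 12 project colleges (so that df = 10), and that the familiar approximation t = r * sqrt((n - 2) / (1 - r^2)) is an acceptable stand-in for whatever test the analysis actually used. The function name and the constant T_CRIT_DF10 are hypothetical labels; 2.228 is the standard two-tailed .05 critical value of t for 10 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a correlation r from n pairs against zero (df = n - 2)."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Two-tailed .05 critical value of t for df = 10 (standard t table).
T_CRIT_DF10 = 2.228

if __name__ == "__main__":
    n_colleges = 12  # assumption: correlations are across the 12 project colleges
    for r in (-0.41, -0.40, -0.65, -0.71):  # coefficients quoted in the text
        t = t_from_r(r, n_colleges)
        verdict = "exceeds" if abs(t) > T_CRIT_DF10 else "does not exceed"
        print(f"r = {r:+.2f}: t = {t:.2f}, {verdict} the critical value")
```

Under these assumptions the coefficients for listening and taking notes fall short of the criterion while those for memorizing exceed it. Neither result bears on the causal question raised in the text, which correlational data of this kind cannot settle.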

Fourth, and most important, how much 
do these different kinds of environmental 
forces contribute to learning and development?
To put it simply, does alpha press or
beta press have most impact? More com- 
plexly, what kinds of development are accel- 
erated or retarded by what kinds of envi- 
ronmental pressures and conditions? What 
vectors of development are most signifi- 
cantly influenced by the ongoing stream of 
behaviors and experiences which character- 
ize a college career? And what vectors are 
most significantly influenced by subjective 
impressions, by the general environment as 
one describes it to oneself and to others? 

In addition to the questions they raise for
basic research, the ECQ results, together 
with Astin’s findings from the Inventory of 
College Activities, have a more immediate 
bearing on evaluative research, where the 
emphasis is on institutional decision mak- 
ing. In the past, complex longitudinal stud- 
ies have been the major models for evalua- 
tive research. But as the rate of social and 
institutional change accelerates and as pres- 
sures for fast- and far-reaching decisions in- 
crease, there is simply not enough time to 
rely on institutional self-studies spanning 4 
or more years. Moreover, the findings from 
such studies may not seem pertinent to 
new students, faculty members, and admin- 
istrators who must try to apply the results. 

Given these conditions, data concerning
the daily activities and experiences of students
provide more immediately useful and
powerful information for program [...]
and decision makers. Often when the nuts
and bolts of concrete activities are revealed,
clear strengths and weaknesses appear;
there is no need to wait for more [...]
information about change in the students
themselves. Suppose, for example, that development
of critical thinking is a [...]
outcome, but memorizing is the student’s
only mental activity as he pursues his academic
work. Must college administrators
delay making program modifications until
test-retest data from longitudinal studies
become available? Surely not. Students
may indeed show improvements in their
critical thinking, but it seems highly unlikely
that such changes occur as a result of
class meetings and out-of-class assignments.

Evidence that a program does not foster 
the behaviors and experiences pertinent to 
desired objectives is usually sufficient rea- 
son to assume that such development is not 
taking place; or if it is, that forces outside 
the program are at work. As better ways to
assess the ongoing stream of student experi- 
ence are developed, evaluative research 
may become increasingly useful. 

These findings have two more immediate
implications for educational planning and
action. First, it is clear that different approaches
to curriculum, teaching, and evaluation
make a substantial difference to the
daily academic experiences of students, and
presumably, therefore, lead to very different
outcomes for intellectual competence, intellectual
interests, and other dimensions of
student development. Classic has a highly
structured curriculum with a strong emphasis
on intellectual competence; in addition
to tests of information, it uses comprehensive
examinations designed to test the student’s
ability to synthesize and to apply his
knowledge. Typically, teachers and students
concentrate on close examination of
short and diverse reading materials selected
by committees of teachers and specially
prepared for particular courses. Kildew is
experimental and progressive. It has no required
courses, and independent study may
be pursued by all students after the first
[...]. Systematic attempts are made to
[...] on off-campus experiences and resources.
Student self-evaluations, instructor
comments, and end-of-semester conferences
between students and instructors take the
place of the conventional examination and
[...] system. These differences in approach
and in practice lead to student experiences
which contrast substantially with
those that characterize Elder and Savior,
[...] typical liberal arts college. It
is likely that the educational and developmental
impact of Classic and Kildew will
be similarly distinctive.

Second, wealth, physical facilities, the 
proportion of doctorates on the faculty, and 
administrative and faculty salaries appar- 
ently have little effect on students’ mental 
activities in or out of class, on the roles and 
behaviors of teachers, on motives for study, 
on feelings of being challenged, confident 
and interested, and on time and effort in- 
vested in study. Thus, if students’ academic 
experiences are to be improved, energy 
should be directed not to plant develop- 
ment, buildings, and facilities, but to rela- 
tions between teachers and students and to 
the expectations and conceptual frame- 
works which influence the way they work 
together.


EXEMPLAR AND NONEXEMPLAR VARIABLES WHICH
PRODUCE CORRECT CONCEPT CLASSIFICATION BEHAVIOR
AND SPECIFIED CLASSIFICATION ERRORS¹


ROBERT D. TENNYSON²
Florida State University
F. ROSS WOOLLEY AND M. DAVID MERRILL
Brigham Young University


Four instructional strategies for promoting the acquisition of an infinite
concept class were investigated. The independent variables were:
(a) probability level of exemplars and nonexemplars determined by
subjects who correctly classify the instance as an exemplar or a nonexemplar;
(b) matching of an exemplar to a nonexemplar so that the
irrelevant attributes are similar; and (c) divergency of an exemplar
with another exemplar so that all of their irrelevant attributes differ.
Exemplars that share irrelevant attributes are convergent. The manipulation
of the independent variables predicted four dependent variables:
(a) correct classification; (b) overgeneralization; (c) undergeneralization;
and (d) misconception. Undergraduate educational psychology
students enrolled at Brigham Young University were selected as the 76
subjects. The four predicted outcomes were all significant at p < .01.


A new concept is acquired when a person 
correctly identifies previously unencoun- 
tered objects or events (or representations 
of such objects or events) as members or 
nonmembers of a particular class. Contro- 
versy has resulted in concept research con- 
cerning the value of negative instances (non- 
exemplars) and their relationship to posi- 
tive instances (exemplars) in promoting 
concept acquisition. Smoke (1933) con- 
cluded that negative instances were of no 


¹ This experiment was supported by the United
States Office of Education Small Contract No.
O-H-014 and Brigham Young University Depart-
ment of Instructional Research and Development.
The authors would like to express their apprecia-
tion to Carol L. Tennyson for her part as the sub-
ject matter expert in preparing the task used in
this study. Thanks also go to our research as-
sistant, Lynda J. Crosby, who helped in the analy-
sis. A report of this research was presented at the
annual meeting of the American Educational Re-
search Association, New York, January, 1971.

² Requests for reprints should be sent to Robert
D. Tennyson, who is now at the Department of
Educational Research and Testing, Florida State
University, Tallahassee, Florida 32306.


value in concept learning. Morrisett and
Hovland (1959), in replication of Adams'
(1954) study of single task versus multiple
task, found that a variety of positive in-
stances was necessary to effect a transfer of
concept learning. In studies of combined in-
stances, the equivalent attributes of posi-
tive and negative instances were found to
be poorly utilized by human subjects (Bru-
ner, Goodnow, & Austin, 1956; Donaldson,
1959; Hovland & Weiss, 1953). Callentine
and Warren (1955) studied positive in-
stances and concluded that the repetition of
one or two instances increased attainment.
Luborsky (1945) indicated that eight exposures
were more effective than three.

Concept acquisition deals with infinite
concept classes as contrasted with finite
classes as used in concept attainment re-
search. An infinite class is one in which all
of the irrelevant attributes associated with
a given exemplar cannot be specified. The
procedure for presentation is deductive in
that the subject is told what the relevant
attributes are and then is given exemplars and
nonexemplars prior to the criterion task of
identifying class membership. Once an in-
stance has been presented and identified by
the subject, it is no longer useful as an item
to measure this behavior.

Mechner (1965) defined concept acquisi- 
tion as generalization within a class and 
discrimination between classes. He pointed 
out that unless both processes were assessed 
simultaneously it was not possible to infer 
concept acquisition. In order to assess con-
cept acquisition, both exemplars and nonex-
emplars must be presented to the subject,
and his ability to generalize to new exem-
plars and discriminate them from nonexem-
plars must be observed. Merrill (1971a) and
Markle and Tiemann (1969) postulated 
that adequate concept acquisition would re- 
sult only if exemplars used during instruc- 
tion differed widely in the irrelevant attri- 
butes associated with each; this promotes 
generalization within the class. Also, dis- 
crimination between classes results from 
presenting nonexemplars which have irrele- 
vant attributes resembling those associated 
with given exemplars. 

Markle and Tiemann (1969) also postu- 
lated that unless the above conditions were 
met, certain classification behavior errors 
would result. These are: overgeneralization, 
undergeneralization, and misconception. 
Overgeneralization occurs when a subject 
correctly identifies all of the exemplars as 
class members, plus identifying some non- 
exemplars as members of the class, that is, 
the subject fails to discriminate between 
classes. Undergeneralization occurs when a
subject identifies the more obvious exem-
plars as class members but indicates that
less obvious exemplars are not class mem-
bers, that is, he fails to generalize to all
members of the class. A misconception re-
sults when a subject falsely assumes that
some irrelevant attribute or combination of
irrelevant attributes is relevant. The opera-
tional consequence is that a subject fails to
identify exemplars not having this attri-
bute as class members and indicates that
exemplars which do have this attribute
are class members.
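
The three error definitions above can be read as rules over a subject's pattern of responses. The following is a minimal sketch of that reading, not the scoring procedure used in the study. The `Item` fields `obvious` and `has_attribute`, and the function name `classify_pattern`, are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    is_exemplar: bool    # true class membership of the instance
    obvious: bool        # hypothetical flag: an "obvious" (easy) instance
    has_attribute: bool  # hypothetical flag: carries one chosen irrelevant attribute


def classify_pattern(items, responses):
    """Label a response pattern with one of the error types defined in the text.

    responses[i] is True when the subject called items[i] a class member.
    """
    pairs = list(zip(items, responses))
    exemplar_calls = [r for it, r in pairs if it.is_exemplar]
    nonexemplar_calls = [r for it, r in pairs if not it.is_exemplar]

    if all(exemplar_calls) and not any(nonexemplar_calls):
        return "correct classification"
    # Misconception: membership is judged by the irrelevant attribute alone.
    if all(r == it.has_attribute for it, r in pairs):
        return "misconception"
    # Overgeneralization: every exemplar accepted, plus some nonexemplars.
    if all(exemplar_calls) and any(nonexemplar_calls):
        return "overgeneralization"
    # Undergeneralization: obvious exemplars accepted, less obvious ones rejected.
    if not any(nonexemplar_calls) and all(
        r == it.obvious for it, r in pairs if it.is_exemplar
    ):
        return "undergeneralization"
    return "unclassified"


items = [
    Item(is_exemplar=True, obvious=True, has_attribute=True),
    Item(is_exemplar=True, obvious=False, has_attribute=False),
    Item(is_exemplar=False, obvious=True, has_attribute=True),
    Item(is_exemplar=False, obvious=False, has_attribute=False),
]
print(classify_pattern(items, [True, True, True, False]))
```

The order of the checks is a design choice of this sketch: misconception is tested before the two generalization errors because a pattern driven by one irrelevant attribute can otherwise resemble either of them.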

Woolley and Tennyson (1972) suggested
that a precise operational definition for
concept acquisition would result if all exem-
plars and nonexemplars to be used for a
concept class were empirically rated on 
their probability of being correctly identi- 
fied by the subject when given only a defi- 
nition (list of relevant attributes). For infi- 
nite concept classes the resulting distribu- 
tion would approximate the normal curve. 
Their report rated exemplars and nonexem- 
plars on a range from high probability 
(those exemplars and/or nonexemplars cor- 
rectly classified by one-half of the subjects 
as members of a given class) to low proba- 
bility (those exemplars and/or nonexem- 
plars which are not correctly classified by 
one-half of the subjects). 


INDEPENDENT VARIABLES 


Based on the theoretical work of Merrill
(1971a, b), Markle and Tiemann (1969), and
Woolley and Tennyson (1972), three inde- 
pendent variables were identified and ma- 
nipulated in this study: 


Probability 


All exemplars and nonexemplars, pre- 
ceded with a definition of the relevant attri- 
butes, were presented to subjects. High- 
probability items are those instances cor- 
rectly classified by 60% or more of the sam- 
ple; medium probability are those correctly 
classified by more than 30% but less than 
60%; and low probability are those in- 
stances correctly classified by less than 30% 
of the sample. 
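
The cutoffs above amount to a simple binning rule on the percentage of raters who classify an instance correctly. A minimal sketch follows; the function name `probability_level` is hypothetical, and the treatment of exactly 30%, which the definitions leave unspecified, is an assumption of this sketch.

```python
def probability_level(n_correct: int, n_total: int) -> str:
    """Bin an instance by the share of raters who classified it correctly."""
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_correct <= n_total")
    percent = 100 * n_correct / n_total
    if percent >= 60:
        return "high"
    if percent < 30:
        return "low"
    # ASSUMPTION: the text defines medium as "more than 30% but less than 60%"
    # and low as "less than 30%", so exactly 30% is not covered. This sketch
    # places it in "medium".
    return "medium"


# With 35 raters, as in the instance probability analysis:
print(probability_level(21, 35))  # 21 of 35 is 60%
print(probability_level(10, 35))  # 10 of 35 is about 28.6%
```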


Matching 


An exemplar and nonexemplar are
matched when the irrelevant attributes of
the two are as similar as possible. An un-
matched relationship between exemplar and
nonexemplar occurs when the irrelevant at-
tributes of the two are as different as possi-
ble.


Divergency 


Two exemplars are divergent when the
irrelevant attributes of the exemplars are as
different as possible. This relationship as-
sumes the same probability level. A conver-
gent relationship occurs when the irrelevant
attributes are as similar as possible.


TABLE 1
HYPOTHESES MATRIX

Dependent variable        Independent variables presented
outcome                   Probability          Matching     Divergency
Correct classification    All levels           Matched      Divergent
Overgeneralization        Low or all levels    Unmatched    Divergent
Undergeneralization       High level           Matched      Divergent
Misconception             All levels           Unmatched    Convergent

HYPOTHESES


The three independent variables were
combined to predict four dependent varia-
ble outcomes. The predicted outcomes were
measured using additional unencountered
exemplars and nonexemplars which the sub-
ject was asked to classify without confir-
mation.

The hypotheses are summarized in Table
1 and by the following statements: (a) If
high to low probability, divergent, and
matched, then correct classification. (b) If
low probability, divergent, and no match-
ing, then overgeneralization. (c) If high
probability, divergent, and matching, then
undergeneralization. (d) If high to low
probability, convergent, and no matching,
then misconception.
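
Hypotheses (a) through (d) can be written as a lookup from a combination of the three independent variables to the predicted outcome. The sketch below only restates those four statements; the key spellings and the name `predicted_outcome` are illustrative assumptions, and combinations the hypotheses do not mention return `None`.

```python
# Keys are (probability, divergency, matching), following statements (a)-(d).
PREDICTED_OUTCOME = {
    ("high to low", "divergent", "matched"): "correct classification",
    ("low", "divergent", "unmatched"): "overgeneralization",
    ("high", "divergent", "matched"): "undergeneralization",
    ("high to low", "convergent", "unmatched"): "misconception",
}


def predicted_outcome(probability: str, divergency: str, matching: str):
    """Return the hypothesized outcome, or None if no hypothesis covers it."""
    return PREDICTED_OUTCOME.get((probability, divergency, matching))


print(predicted_outcome("low", "divergent", "unmatched"))
```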


METHOD 
Subjects 


Thirty-five spring semester undergraduate edu- 
cational psychology students enrolled at Brigham 
Young University served as subjects for the in- 
stance probability analysis. Educational psychology 
classes provided the additional 76 subjects who 
participated in the experiment. Each subject's 
grade point average was used as a covariate. 


Program 


FIG. 1. Frequency distribution of exemplars and nonexemplars used on the instance
probability analysis. (Figure not reproduced; recoverable labels: Low Probability, Instances.)

In the classification program the exemplars were divergent in their irrelevant attributes,
that is, rhyme, feet, length, style, author, period, etc., arranged from high to low probability,
and the nonexemplars were matched to the exemplars in their irrelevant attributes and with
similar probability. A sample of the first page of selections for the classification program
shows high-probability exemplars and nonexemplars.


Divergent

Not an Example:

Out of childhood into manhood
Now had grown my Hiawatha
(Longfellow)

Come to the crag where the beacon is blazing
Come with the buckler, the lance, and the bow
(Scott)

Pansies, lilies, kingcups, daisies.
(Wordsworth)

Motherly, Fatherly, Sisterly, Brotherly!
(unknown)


The overgeneralization task was constructed with divergent low-probability exemplars 
and unmatched with nonexemplars on all four pages. The first page of this program is
given here to contrast with the classification program above. 


Example:

Not an Example:

There they are, my fifty men and women,
Naming me the fifty poems finished!
(R. Browning)

If the heart of a man is depressed with cares,
The mist is dispell’d when a woman appears.
(Gay)

Boys in sporadic, tenacious droves
Come with sticks, as certainly as Autumn.
(Eberhart)

Alone, alas,
He sat.
(unknown)
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The undergeneralization task had only divergent high-probability exemplars which 
matched with the nonexemplars. The first page of this program was the same as the
classification program, with the succeeding pages on an equal level of difficulty. The 
example shown here is the last page: 


Example:         Out of friendship came the Redman
                 Teaching settlers where the deer ran.
                 (Imitation, Longfellow)

Matched:

Divergent:

Not an Example:  The smiles that win, the tints that glow,
                 But tell of days of goodness spent.

Example:         Maid of Athens, ere we part,
                 Give, oh give me back my heart!
                 (Byron)

Not an Example:  Sure solacer of human cares,
                 And sweeter hope, when hope despairs!
                 (Bronte)


For the misconception program the convergent grouping was Victorian period trochaic
meter poetry. The selections included probability ratings of high, medium, and low. The
nonexemplars were unmatched to the exemplars. Following is an example from that
program:


Example:         There they are, my fifty men and women,
                 Naming me the fifty poems finished!
                 (R. Browning)

Not an Example:  Give every man thy ear, but few thy voice.
                 This above all: to thine own self be true.
                 (Shakespeare)

Example:         Wailing, Wailing, Wailing, the wind over
                 land and sea—
                 (Tennyson)

Convergent:

Not an Example:  When I was one-and-twenty
                 I heard a wise man say,
                 “Give crowns and pounds and guineas,
                 But not your heart away.”
                 (Housman)


Directions were read aloud by the experimenter while the subjects read silently. Once 
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TABLE 2 
SCORING SHEET

Item           M              O      U      C
1. eg #6       X              X      X      X
2. eg #16      X              X      O      X
3. eg #3       —              X      O      O
4. eg #30      X              X      O      O
5. eg #15      [illegible]    —      O      O
6. eg #12      O              X      X      X
7. eg #21      O              X      O      X
8. eg #8       O              —      O      O
9. eg #17      O              X      O      O
10. eg #4      O              X      O      O


Note.—Predicted responses according to condi-
tions; M = misconception; O = overgeneraliza-
tion; U = undergeneralization; C = correct classi-
fication; X = subject indicates this selection is an
exemplar; O = subject indicates this selection is
a nonexemplar; — = subject could classify as
either, no error possible; eg indicates an exemplar;
eg indicates a nonexemplar; # refers to original
test item number.


Test 


The test was constructed so that the predicted
responses of the dependent variables could be
analyzed. Thirty selections of poetry were separated
into three parts with the following format:
1. Convergent high-probability exemplar.
2. Convergent low-probability exemplar.
3. High-probability nonexemplar matched to
Number 1.
4. Low-probability nonexemplar matched to
Number 2.
5. High-probability nonexemplar unmatched.
6. Divergent high-probability exemplar paired
to Number 1.
7. Divergent low-probability exemplar paired
to Number 2.
8. High-probability nonexemplar matched to
Number 6.
9. Low-probability nonexemplar matched to
Number 7.
10. Low-probability nonexemplar unmatched.
The thirty selections were randomly scrambled
so that no patterns were evident to the subjects.

To test the dependent variable of misconception,
the grouping of Victorian period poetry was identi-
fied as convergent and all other grouping was
divergent. The hypothesized response patterns for each of
the dependent variables are given in Table 2. Re-
sponses for each subject were compared with the
predicted score for each dependent variable. The
subject was scored with an error for a given
dependent variable when his response to a given item
differed from the predicted response. Scores were
obtained for the three parts of the test and
then added together for the four separate depend-
ent variable conditions. This procedure gave each
subject four scores, one for each hypothesized de-
pendent variable.
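
The scoring rule just described (one error for each response that departs from the predicted response, summed over the three parts of the test) can be sketched as below. The two patterns shown are assumptions derived from the ten-item test format, in which items 1, 2, 6, and 7 are exemplars and items 1 and 6 are the high-probability ones; they are not copied from the scoring sheet. The names `count_errors` and `score_subject` are hypothetical.

```python
# "X" = called an exemplar, "O" = called a nonexemplar,
# "-" = either response is acceptable (no error possible).
# ASSUMPTION: illustrative patterns for one 10-item part of the test.
PATTERNS = {
    "C": "XXOOOXXOOO",  # correct classification: exactly items 1, 2, 6, 7 accepted
    "U": "XOOOOXOOOO",  # undergeneralization: only high-probability exemplars accepted
}


def count_errors(responses: str, predicted: str) -> int:
    """Count responses that differ from the predicted pattern, skipping '-'."""
    if len(responses) != len(predicted):
        raise ValueError("response and pattern lengths differ")
    return sum(1 for r, p in zip(responses, predicted) if p != "-" and r != p)


def score_subject(parts, patterns=PATTERNS):
    """Sum errors over the parts of the test, once per dependent variable."""
    return {
        name: sum(count_errors(part, pattern) for part in parts)
        for name, pattern in patterns.items()
    }


# One hypothetical subject's responses to the three ten-item parts.
parts = ["XXOOOXXOOO", "XOOOOXOOOO", "XXOOOXXOOX"]
print(score_subject(parts))
```

Each subject thus receives one error total per dependent variable; with all four patterns supplied, this yields the four scores per subject described above.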


Experimental Design 


A posttest-only control group was used in this 
experiment (Campbell & Stanley, 1963). Internal 
validity was controlled by random assignment of 
subjects to the four programs. Since the programs 
were administered to individual subjects the basic 
experimental unit was the subject. The n size for 
each treatment was 19, total n = 76. External 
validity was a problem since the subjects were not 
randomly sampled from the universal population. 


RESULTS 


Variable Measures 


Four error scores were obtained for each 
subject according to the predicted responses 
on the dependent variables (Table 2). 
Table 3 shows the treatment groups, repre- 
sented by capital letters, and the predicted 
errors for each dependent variable, that is, 
the C (classification) group would make 
zero errors under the correct classification 
variable but it was predicted that O (over- 
generalization) group would make eight er- 
rors, the U (undergeneralization) group, six 
errors, and the M (misconception) group 
would make nine errors. Thus, each group 
was predicted to make significantly fewer 
errors than the other three conditions when 
its dependent variable was analyzed. Like- 
wise, the other variations in error scores per 
group were predicted. 


TABLE 3
HYPOTHESIZED ERROR RESPONSES

                         Groups (predicted errors)
Dependent variable       C      O      U      M
Classification           0      8      6      9
Overgeneralization       3      0     14      1
Undergeneralization      6     14      0      9
Misconception            9     11      9      0


Note.—The treatment groups are represented
by capital letters: C = correct classification;
O = overgeneralization; U = undergeneralization;
M = misconception.
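
The pairwise predictions discussed in the Results are differences between entries of Table 3. A minimal sketch of that arithmetic follows, using the values as they can be read from the table; the dictionary layout and the name `predicted_difference` are assumptions made for illustration.

```python
# Predicted errors per dependent variable and treatment group, as read from Table 3.
PREDICTED_ERRORS = {
    "classification":      {"C": 0, "O": 8,  "U": 6,  "M": 9},
    "overgeneralization":  {"C": 3, "O": 0,  "U": 14, "M": 1},
    "undergeneralization": {"C": 6, "O": 14, "U": 0,  "M": 9},
    "misconception":       {"C": 9, "O": 11, "U": 9,  "M": 0},
}


def predicted_difference(variable: str, group_a: str, group_b: str) -> int:
    """Absolute predicted difference in errors between two groups."""
    row = PREDICTED_ERRORS[variable]
    return abs(row[group_a] - row[group_b])


print(predicted_difference("classification", "U", "O"))      # 2
print(predicted_difference("classification", "U", "M"))      # 3
print(predicted_difference("overgeneralization", "O", "U"))  # 14
```

These three values correspond to the two-error, 3-point, and 14-point predicted differences cited in the text.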
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TABLE 4 
MEAN SCORES OF THE FOUR
DEPENDENT VARIABLES

                         Group means
Dependent variable       C        O        U        M
Classification           5.68    12.97     9.83    11.98
Overgeneralization       9.01     7.00    11.83     9.25
Undergeneralization      8.55    14.33     6.22    11.62
Misconception            9.80    10.52     9.75     7.38


The adjusted covariate means for the 
four treatment groups according to the de- 
pendent variables are listed in Table 4. A 
separate analysis of covariance was used 
for each dependent variable, that is, for the 
classification variable the means used were 
C (5.68), O (12.97), U (9.83) and M 
(11.98). The four covariate F tests (df = 
3/72) were: Classification, F = 16.65; Over- 
generalization, F = 12.57; Undergeneral- 
ization, F = 24.79; and Misconception, F = 
7.44. The a posteriori tests used were the
Newman-Keuls sequential test and Dun- 
can’s new multiple-range test. 
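
The central prediction, that each treatment group makes the fewest errors on its own dependent variable, can be checked directly against the adjusted means of Table 4. The sketch below is descriptive only and does not reproduce the analysis of covariance or the multiple-comparison tests; the names `ADJUSTED_MEANS`, `OWN_GROUP`, and `lowest_group` are hypothetical.

```python
# Adjusted group means per dependent variable, as listed in Table 4.
ADJUSTED_MEANS = {
    "classification":      {"C": 5.68, "O": 12.97, "U": 9.83,  "M": 11.98},
    "overgeneralization":  {"C": 9.01, "O": 7.00,  "U": 11.83, "M": 9.25},
    "undergeneralization": {"C": 8.55, "O": 14.33, "U": 6.22,  "M": 11.62},
    "misconception":       {"C": 9.80, "O": 10.52, "U": 9.75,  "M": 7.38},
}

# The treatment group expected to show the fewest errors on each variable.
OWN_GROUP = {
    "classification": "C",
    "overgeneralization": "O",
    "undergeneralization": "U",
    "misconception": "M",
}


def lowest_group(variable: str) -> str:
    """Return the group with the smallest adjusted mean error score."""
    means = ADJUSTED_MEANS[variable]
    return min(means, key=means.get)


for variable, group in OWN_GROUP.items():
    print(variable, lowest_group(variable) == group)
```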


Correct Classification 


The hypothesized correct classification 
variable was constructed of matched exem- 
plars and nonexemplars, and divergent ex- 
emplars on a high- to low-probability con- 
tinuum (Table 1). On both the Newman- 
Keuls and Duncan test, the C group made 
fewer errors than the O, M, and U groups 
(p < .01). This corresponds to the hypothe- 
sis and the predicted results in Table 3. 
There was a difference between the O group
and the U group on the Newman-Keuls (p
< .05) and on the Duncan test (p < .01).
According to Table 3 there was a predicted
difference of two errors between the U
group and the O group. No difference was
found between the U and M groups (p >
.05). A 3-point difference was predicted.


Overgeneralization 


The overgeneralization dependent varia- 
ble resulted from divergent low-probability 
exemplars that were unmatched with non- 
exemplars. The multiple comparison of the
Newman-Keuls showed a difference be-
tween the O group and the U group (p <
.01); this follows the prediction from Table
3 of an error spread of 14 points. A differ-
ence existed between the O group and the M and C
groups on the Newman-Keuls (p < .05).
The Duncan new multiple-range test shows
a difference for O with U and M groups (p
< .01), and O groups and C groups (p <
.05). Other predicted differences from Table
3 on both tests were between the C group
and U group and the M and U groups (p <
.01). There was no difference between C
group and M group (p > .05).


Undergeneralization 


The undergeneralization treatment condi-
tion received the independent variables of
high-probability divergent exemplars and
matched nonexemplars. The multiple com-
parisons of the undergeneralization error
scores show for both the Newman-Keuls
and Duncan test that the U group differed
from the O and M groups (p < .01). The
predicted error difference between the U and C
groups was six (Table 3). There was a dif-
ference between C and U on the correct
classification analysis (p < .01), but here
the difference was less on both tests (p <
.05), probably the result of the U group
generalizing more than predicted. Other
predicted differences were: the O group re-
ceived a higher error mean than the other
groups on this dependent variable, as pre-
dicted (p < .01); the difference between O
group and M group was predictably lower
(p < .01); the difference narrows on the C
group and the M group (p < .01).


Misconception


In the misconception treatment groups
the subjects were instructed with conver-
gent Victorian period, high-, medium-, and
low-probability exemplars with unmatched
nonexemplars. The results followed the pre-
dicted variables on all factors on both tests
(p < .01). The M group was different from
the O, U, and C groups (p < .01). No sig-
nificance resulted from the comparison of
the other three groups as predicted in Table
4 (p > .05).


DISCUSSION


The significant results of this study have
implications for instructional procedures at
the cognitive level of behavior. If instruc-
tion does not include empirically founded
principles, the student may not learn all
that is desired by the instructor and stu-
dent. The problems of overgeneralization,
undergeneralization, and misconception as a
result of faulty classification behavior in-
struction of a concept class discussed by
Markle and Tiemann (1969) are now more
than a hypothetical position.

Precise independent variables were ar- 
ranged in such a way that predicted de- 
pendent variables did result in all cases. 
Four of the 12 difference predictions (Table 
3) did not reach the .01 level, but were sig- 
nificant at the .05 level. The implications 
are clear that instruction does produce cer- 
tain types of dependent variables. If these 
are not controlled by empirically based pro- 
cedures there is little assurance that stu- 
dents learn behavioral objectives—no mat- 
ter how Magerian (1962) and precisely 
stated. 

The independent variable of probability
rating of exemplars is crucial in instruc-
tional design. The instance probability
analysis did produce ratings that the author
could not have known. Item analysis is not
a new procedure. However, the implications
in the context of this research have not been
implemented on an empirically based in-
structional system. The most significant
difference obtained in this study was be-
tween the undergeneralization group and
the overgeneralization group. The undergen-
eralization group was presented only
high-probability exemplars and, as a result,
responded to few items on the test. On the
contrary, the overgeneralization group re-
ceived only low-probability exemplars and
responded to practically every instance on
the test. The significant results were not
only for the main effect, but for the pre-
dicted differences for the other effects on
each dependent variable.

The independent variable of matched ex-
emplar/nonexemplars in an infinite concept
class was involved. This can be seen empiri-
cally by the increased response to nonexem-
plars by the overgeneralization and miscon-
ception groups. In both cases the nonexem-
plars were unmatched to the exemplars so
that subjects failed to recognize the rele-
vant attributes from the irrelevant attri-
butes. The generalization and undergeneral-
ization groups received matched selections
of equal probabilities. In both cases fewer 
nonexemplars were chosen as exemplars 
than vice versa. This implies that discrimi- 
nation is more effectively taught if the 
matching of exemplars and nonexemplars is 
empirically controlled by an instance prob- 
ability rating of both exemplars and nonex- 
emplars. 

The third independent variable of the re- 
lationship between exemplars according to 
their irrelevant attribute groupings was sig- 
nificant. The three treatments of classifica- 
tion, undergeneralization, and overgenerali- 
zation all received divergent exemplars 
based on their probability ratings. Only the 
misconception group did not receive diver- 
gent exemplars. The importance of diver- 
gency of equal probability exemplars that 
differ in all irrelevant attributes as opposed 
to presenting convergent exemplars is em-
pirically shown by the predicted results of
the misconception dependent variable. This 
treatment group did not choose exemplars 
which differed from those presented in the 
misconception task. 

Work on different subject matter tasks 
and sample populations will add external 
validity to the results gained here. Varia- 
bles and implications obtained in this study 
could be expanded. Areas that need investi- 
gation are: the most effective number of 
exemplars and nonexemplars; modification 
procedures to correct the three problems of
concept instruction; the effect of the three
problems on high levels of cognitive learning
(Merrill, 1971b); and paralleling instructional
strategies and independent variables on the
higher levels of learning.
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TEACHERS' MARKS, ACHIEVEMENT TEST SCORES, AND 
APTITUDE RELATIONS WITH RESPECT TO SOCIAL
CLASS, RACE, AND SEX


BOYD R. McCANDLESS¹ AND ALBERT ROBERTS
Emory University
THOMAS STARNES
Southeastern Educational Laboratory, Atlanta, Georgia


Eight cells (Advantagement X Sex X Race) were filled evenly by
seventh-grade public school subjects for intelligence, standardized
achievement, and teachers’ academic subject marks data. Over all,
standardized achievement accounts for about 9% of teachers’ marks
variance. Boys receive much lower grades than girls and are also
somewhat lower in standardized achievement. Teachers’ marks are cor-
related modestly and positively with intelligence but not with achieve-
ment. Consistency of direction and level of variables’ relationships
within subgroups approaching the ideal appear only for disadvantaged
black girls. It may be that teachers assign marks to the advantaged
(particularly advantaged whites) according to intelligence and type
of socialization, and to the disadvantaged (particularly disadvantaged
girls) according to intelligence and objectively measured school
achievement.


When the first author was writing Ado-
lescents: Behavior and Development
(McCandless, 1970), he looked in vain for
coherent, comparable-sample applied re-
search that dealt with all the following
norms and relations in such a manner as to
be useful to the many who need such infor-
mation: (a) school aptitude (intelligence)
and standardized achievement test results;
(b) aptitude scores and teachers’ grades;
(c) teachers’ grades and standardized
achievement test results.

This information can be provided when
the three variables are studied as they vary
with respect to sex, social class, and ethnic-
ity.

Logical predictions about results from
such comparisons can be made from the lit-
erature (see McCandless, 1970, for a repre-
sentative review). These predictions are:


¹ Requests for reprints should be sent to Boyd
R. McCandless, Department of Psychology, Emory
University, Atlanta, Georgia 30322.


(a) Conventional school aptitude (intelli- 
gence) measures predict standardized 
achievement test scores equally well for 
boys and girls, but better for advantaged 
than disadvantaged children. (b) Teachers’ 
marks are more accurate for girls than 
boys, when judged against the sexes’ stand- 
ardized achievement test scores; for mid- 
dle-class than for disadvantaged children; 
and are (perhaps) least accurate for disad- 
vantaged black males. (c) Teachers consist- 
ently give girls higher grades than boys, but 
there are no important differences between 
boys’ and girls’ achievement when measured 
by standard achievement tests. More tenta- 
tively, it can be predicted that disadvan- 
taged black males receive the lowest teach- 
ers’ marks of eight frequent major group- 
ings of two social classes by two sexes by 
two races. 

If these logical predictions are correct, 
their educational and social implications 
are clear: Teachers’ practices differentially 
discourage lower social-class children, and
particularly lower social-class males. These
practices may exacerbate the frustrations 
that disadvantaged children experience in 
school and, among other things, may in- 
crease the rate of school dropout among the 
very group that most urgently needs to stay 
in school or, alternatively, may frustrate 
lower-class boys and girls in ways that re- 
sult in scholastic inefficiency when they do 
persist in school. 


METHOD


Subjects 


Subjects were 443 Atlanta, Georgia, public 
school children for whom teachers’ marks were 
available for the first two trimesters of their 
seventh-grade year. All subjects had taken the
California Test of Mental Maturity in the spring
of their seventh-grade year, and all had taken the
Metropolitan Achievement Test in the fall of their
seventh-grade year. Eight cells were filled by ap-
proximately 50 children each: advantaged black
boys, white boys, black girls, and white girls; and 
disadvantaged black boys, white boys, black girls, 
and white girls. 

The advantaged or middle-class subjects at- 
tended five schools where fewer than 18% of the 
parent clientele had incomes below $3,000. Disad- 
vantaged or lower social-class subjects attended 
five different schools. For each school, more than 
47% of the parent clientele had incomes lower than 
$3,000. The five disadvantaged schools served 
poverty neighborhoods (Office of Economic Op-
portunity criteria), and three were included in the
Atlanta Model Cities Project. 

The final match from among the available 
schools was made by consulting two veteran 
Atlanta Public School officials who knew the city’s 
communities from long, close personal experience.²
They guided the authors in making the closest 
possible match within the formal statistical cri- 
teria, using such criteria as incidence of fatherless 
households, working mothers, and fathers and 
mothers holding two jobs. 

Overall, the advantaged black and white schools 
are closely matched by objective and subjective 
criteria of “advantagement,” but the disadvantaged 
schools may be slanted a bit to the effect that, as a
group, the white pupils are more disadvantaged 
than the black pupils. With a very few chance (and 
unknown) exceptions, all children were taught by 


² The authors appreciate the help of John
Blackshear and Otis White from the administrative
staff of the Atlanta Public Schools
with this matching, as well as the cooperation of
Jarvis Barnes, Superintendent for Research and
Development, Atlanta Public Schools, and the
principals and secretaries of the schools in which
data were gathered.
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teachers from both races. Almost all teachers were 
female. 


Procedure 


All data were taken from pupils’ records on file 
at the 10 different schools. The authors moved from
school to school until the eight cells listed above
were filled. The data consisted of the following:
(a) The total score for each subject from the Cali- 
fornia Test of Mental Maturity from the spring of 
the seventh-grade school year was the aptitude 
(intelligence) datum. (b) Using 4.0 as an A, aver- 
age teacher grades for the fall and winter trimes- 
ters of the seventh-grade year were averaged for 
reading, language, arithmetic, social studies, and 
science. (c) Metropolitan Achievement Test scores, 
converted to grade equivalents, comprise the 
achievement test data. 

Following collection of data, relevant correla- 
tions and measures of central tendency were com- 
puted, and tests of homogeneity of correlation and 
means were performed. 
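
One common form of a chi-square test for homogeneity of correlations works on Fisher's z transformation of each coefficient, weighting each z by n − 3. The sketch below assumes that this is the kind of procedure meant by the test cited from Snedecor (1956); the function name `homogeneity_chi_square` is hypothetical, and the example values (r = .20 with n = 230, r = .74 with n = 213) are the advantaged and disadvantaged correlations between intelligence and standardized achievement reported in Table 1.

```python
import math


def homogeneity_chi_square(correlations):
    """Chi-square test of homogeneity for independent correlations.

    correlations: iterable of (r, n) pairs.
    Returns (chi_square, degrees_of_freedom).
    ASSUMPTION: Fisher z version of the test, with weights n - 3.
    """
    pairs = list(correlations)
    weights = [n - 3 for _, n in pairs]
    zs = [math.atanh(r) for r, _ in pairs]  # Fisher z transformation
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    chi_square = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
    return chi_square, len(pairs) - 1


chi_square, df = homogeneity_chi_square([(0.20, 230), (0.74, 213)])
print(round(chi_square, 1), df)
```

For this pair the statistic is far above 6.63, the .01 critical value of chi-square with 1 degree of freedom, which agrees with the table's statement that the two correlations differ significantly.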


RESULTS 


The most interesting facts in Table 1 are: 
First, while school aptitude (intelligence) 


TABLE 1
CORRELATION COEFFICIENTS FOR INTELLIGENCE,
STANDARDIZED ACHIEVEMENT, AND TEACHERS’
MARKS FOR THE TOTAL GROUP, AND BY
ADVANTAGEMENT, SEX, AND RACE

                                          r
                        Intelligence     Intelligence      Standardized
                        versus           versus            achievement
Group             n     standardized     average           versus average
                        achievement      teachers’ marks   teachers’ marks
Total            443    .45              .56               .31
Advantaged       230    .20              .56               [illegible]
Disadvantaged    213    .74ᵃ             .60               .58
Boys             221    .36              .50               .20
Girls            222    .53              .66               .39
Blacks           225    .55              .64               [illegible]
Whites           218    .19              .50               [illegible]


ᵃ All paired correlations (e.g., .20 and .74 for
intelligence versus standardized achievement for
advantaged compared with disadvantaged) dif-
fer significantly from each other at the .01 level or
less (Snedecor, 1956, chi-square test for homoge-
neity of correlation).

ᵇ This correlation does not reach significance at
the .05 level. All other correlations are significant
at .05 or less.
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scores predict both standing for standard- 
ized achievement tests and teachers' marks 
fairly well (r for the total group of 443 
subjects is equal to .45 and .56, respec-
tively), standardized achievement test re-
sults account for less than 10% of the vari-
ance of teachers’ marks (r for the total
group is only .31).

Second, when the three major divisions of 
the sample are considered (advantaged ver- 
sus disadvantaged, boys versus girls, and 
blacks versus whites), teachers with some
consistency assign their grades according to
intelligence test standing (Table 1, column
3). Each member of the pairs of correla-
tions in column 3, rows 4, 5 and 6, 7 differs
significantly from the other (Snedecor, 
1956), but the range of differences is only 
from .16 for boys and girls to the nonsig- 
nificant .04 for advantaged and disadvan- 
taged. Maximum variance of teachers’ 
marks accounted for by intelligence is 
about 45% (all girls), minimum is about 
25% (all boys, all whites). 

Third, correlations between intelligence 
and standardized achievement test results, 
and standardized achievement test stand- 
ings and average teachers’ marks (Table 1, 
columns 2 and 4) are remarkably low for 
the advantaged, for boys, and for whites 
(range in columns 2 and 4 is from .11 to .36, 
or from only about 1% to about 13% of the 
variance of one variable accounted for by
the other). 

Correlations among the three school-re- 
lated variables are shown in Table 2 for the 
Advantagement x Sex, Race x Sex, and 
Advantagement x Race subgroups. 

Although the two sets of four Advantage-
ment x Sex, and Race x Sex correlations
(column 3, rows 1, 2, 3, 4 and 5, 6, 7, 8) are
heterogeneous (Snedecor, 1956), absolute
differences are not great and thus the prac-
tical significance of the differences is slight.
The range of variance predicted for teach-
ers’ marks by intelligence for all the corre-
lations (Table 2, column 3) is from about
52% (for disadvantaged girls) to about
16% (for white boys).

Correlations between intelligence and
standardized achievement test standings are
strikingly low for both advantaged boys
and girls (Table 2, column 2, rows 1 and 2) and for
TABLE 2
CORRELATION COEFFICIENTS FOR INTELLIGENCE,
STANDARDIZED ACHIEVEMENT, AND TEACHERS’
MARKS FOR SEX BY ADVANTAGEMENT AND
RACE, AND ADVANTAGEMENT BY
RACE SUBGROUPS


                                            r

                           Intelligence     Intelligence     Standardized
                           versus           versus           achievement tests
                           standardized     average          versus average
                           achievement      teachers’        teachers’
Group                  n   tests            marks            marks

Advantaged boys       116    .17^a,b,d        .57^c           -.08^a,b
Advantaged girls      114    .27              .61              .15^a
Disadvantaged boys    105    .56              .45              .46
Disadvantaged girls   108    .84              .72              .64
Black boys            115    .49^b            .64^c            .45^b
White boys            106    .08^a            .40             -.01^a
Black girls           110    .65              .67              .49
White girls           112    .30              .62              .21
Advantaged blacks     116    .45^b            .65              .37^b
Advantaged whites     114   -.25              .56             -.11
Disadvantaged blacks  109    .62              .60              .62
Disadvantaged whites  104    .71              .55              .50


^a Does not reach significance at the .05 level.
All other correlations are significant at .05 or less.

^b This set of four correlations is not homogene-
ous, p < .01 (Snedecor, 1956).

^c This set of four correlations is not homogene-
ous, p < .05 (Snedecor, 1956).

^d In sets of four correlations (e.g., column 2,
rows 1, 2, 3, and 4), pairs of correlations differing
from each other by .22 or less are from a homogene-
ous population; those differing by .25 or more come
from heterogeneous populations at the .01 level or
less; all others depart from homogeneity at less
than .05 (Snedecor, 1956).


advantaged whites of both sexes (in column 
2, row 10, the correlation between intelli- 
gence and standardized achievement is 
—.25!). 

Teachers’ marks bear little relation to
standardized test performance for either 
advantaged boys or girls (Table 2, column 
4, rows 1 and 2), for white boys or girls 
(rows 6 and 8), or for advantaged whites 
(Table 2, column 4, row 10). 

A continuation of the picture drawn 
above can be seen in Table 3. The lowest 
TABLE 3
CORRELATION COEFFICIENTS FOR INTELLIGENCE,
STANDARDIZED ACHIEVEMENT, AND TEACHERS’
MARKS BY ADVANTAGEMENT AND RACE
AND SEX SUBGROUPS


                                               r

                                Intelligence    Intelligence    Standardized
                                versus          versus          achievement
                                standardized    teachers’       versus teachers’
Group                       n   achievement     marks           marks

Advantaged black boys      59     .43^b,c         .78^b           .38^b
Advantaged white boys      57    -.35             .56            -.17^a
Advantaged black girls     57     .50             .55             .32
Advantaged white girls     57    -.14             .67            -.07^a
Disadvantaged black boys   56     .55             .47             .58
Disadvantaged white boys   49     .47             .40             .35
Disadvantaged black girls  53     .78             .83             .73
Disadvantaged white girls  55     .80             .60             .52


^a Does not reach significance at the .05 level.
All other correlations are significant at .05 or less.

^b This set of eight correlations is not homogene-
ous, p < .01 (Snedecor, 1956).

^c In sets of eight correlations (e.g., column 2,
rows 1 through 8), pairs of correlations that differ
from each other .21 or less come from homogeneous
populations; those that differ .28 or more check out
as heterogeneous at the .01 level or less; all other
pairs are heterogeneous at less than .05 (Snede-
cor, 1956).


correlation in Table 3, column 3, is between 
intelligence and teachers’ marks for disad-
vantaged white boys (r = .40). The figure
for disadvantaged black boys is only
slightly higher (r = .47). Nor for either
racial group of disadvantaged boys is ac- 
tual achievement as judged from standard 
tests strongly reflected in teachers’ marks (r 
= .58 for black boys, .35 for white boys). 
However, the intelligence versus teachers’ 
marks for disadvantaged black girls is a 
striking .83 (about 70% of the variance of 
one variable is accounted for by the other). 

From Table 3, column 2, it can also be 
seen that intelligence and achievement test 
results are negatively correlated for both 
advantaged white boys and girls (signifi-
cantly so for the former), but reach a high
level in the expected direction for disadvan- 
taged black and white girls of .78 and .80, 
respectively. 

On the whole (Table 3, column 4), as has 
been noted earlier, teachers’ marks bear lit-
tle relation to children’s standings on stand-
ardized achievement tests, although the cor-
relations between the two run as high as .73
for disadvantaged black girls. However, the 
correlations are actually negative in direc- 
tion but nonsignificant for advantaged 
white boys and girls (rs = —.17 and —.07, 
respectively). 

Finally, for only one subgroup, disadvan- 
taged black girls (row 7), are the three 
correlations among the school-related vari- 
ables consistent and high in all three col- 
umns of correlations. Efficiency of predic- 
tion among the variables is next best for 
disadvantaged white girls (Table 3, row 8). 

Central tendency data for the three
school-related variables are included in
Table 4 for the total group and for the ad-
vantagement, sex, and race groups. As was
expected, the disadvantaged are lower than
the advantaged for all three variables (al-
though the disadvantaged perform closer to
expectancy in standardized test achieve-
ment grade level, as established by intelli-
gence, than do the advantaged).


TABLE 4
MEANS AND STANDARD DEVIATIONS FOR SCHOOL
APTITUDE (INTELLIGENCE), STANDARDIZED
ACHIEVEMENT TEST RESULTS, AND AVERAGE
TEACHER MARKS IN ACADEMIC SUBJECTS
FOR TOTAL GROUP, AND BY
ADVANTAGEMENT, SEX,
AND RACE


                                          Achievement test    Average teachers’
                      Intelligence        grade level         marks (4.0 = A)
Group            n    M        SD         M        SD         M        SD

Total           443   90.9     20.1       5.43     1.10       2.4      [illegible]
Advantaged      230   99.1*    18.5       5.54*    1.19       2.5**    [illegible]
Disadvantaged   213   81.9     17.8       5.12      .95       2.3      [illegible]
Boys            221   89.4     21.7       5.19*    1.07       2.2      [illegible]
Girls           222   92.3     18.3       5.48     1.11       2.6      [illegible]
Blacks          225   84.3*    19.8       5.00*    1.05       2.3      [illegible]
Whites          218   97.6     18.0       5.68     1.05       2.5      [illegible]

* t is significant at .01 or less.

** t is significant at less than .05 but greater than
.01.

There is no intelligence difference be- 
tween boys and girls, but the latter are sig- 
nificantly higher than the former in both 
standard achievement test standing and 
teachers’ marks (rows 4, 5). 

Black children fall below white children 
in standing for all three variables but, com- 
pared to whites, achieve equally close to 
expectancy in standardized achievement
test results (expectancy based on intelli- 
gence test scores). 

From Table 5, it can be seen that disad- 
vantaged boys, black boys, and disadvan- 
taged blacks fare markedly less well than 
any of the other groups for all three 
school-related variables, and that (although 
not so strikingly in all instances) advan-
taged girls, white girls, and advantaged
whites head the list in their standings on 
the three school-related indexes. 

In Table 6 are given the central tendency 
data for intelligence, standardized achieve- 
ment, and average teachers’ marks for the 
eight Advantagement x Race x Sex sub- 
groups of the sample. While F for each of 
the three columns of means is significant at 
less than the .01 level, no difference between 
any pair of means in columns 4 and 6 is 
significant (Scheffé’s test; Hays, 1963).
However, inspection of the rows and col- 
umns in Table 6 adds information to that 
obtainable from the grosser groupings of 
subjects, as reported in Tables 4 and 5. 

It can be seen from Table 6, rows 1, 2, 3 
and 4 (the advantaged groups), that teach- 
ers mark boys, regardless of race, more se- 


TABLE 5
MEANS AND STANDARD DEVIATIONS FOR SCHOOL APTITUDE (INTELLIGENCE), STANDARDIZED
ACHIEVEMENT TEST RESULTS, AND AVERAGE TEACHER MARKS FOR SEX BY
ADVANTAGEMENT AND RACE, AND ADVANTAGEMENT BY RACE SUBGROUPS


                                                 Achievement test    Average teachers’
                             Intelligence        grade level         marks
Group                   n    M         SD        M         SD        M        SD

Advantaged boys        116    99.4^a   19.4      5.41^b    1.28      2.2^c    .80
Advantaged girls       114    98.7     17.6      5.67      1.14      2.7      .80
Disadvantaged boys     105    78.2     18.4      4.95       .81      2.1      .80
Disadvantaged girls    108    85.5     16.5      5.29      1.05      2.5      .80
Black boys             115    83.0^d   22.4      4.88^e     .99      2.1^f    .70
White boys             106    96.2     18.6      5.53      1.07      2.2      .90
Black girls            110    85.6     16.7      5.12      1.10      2.4      .80
White girls            112    98.8     17.5      5.83      1.01      2.8      .80
Advantaged blacks      116    92.7^g   18.2      5.20^h    1.20      2.4^i    .70
Advantaged whites      114   105.5     16.5      5.81      1.12      2.5      .90
Disadvantaged blacks   109    75.4     17.5      4.73      [illegible]  2.1   .70
Disadvantaged whites   104    88.8     15.5      5.54       .95      2.5      .80


^a For the set of four means in this column, employing the Scheffé test, means that differ by 12.1 or
more are significantly different at the .01 level or less. Overall F significant at less than .01.

^b For this cell, by Scheffé test, means differing by .72 or more differ significantly at the .01 level or
less. Overall F significant at less than .01.

^c For this cell, by Scheffé test, means differing by .5 or more differ significantly at the .01 level or
less. Overall F significant at less than .01.

^d For the set of four means in this column, employing the Scheffé test, means that differ by 12.7 or
more are significantly different at the .01 level or less. Overall F significant at less than .01.

^e For this cell, by Scheffé test, means differing by .67 or more differ significantly at the .01 level or
less. Overall F significant at less than .01.

^g For the set of four means in this column, employing the Scheffé test, means that differ by 11.4 or more
are significantly different at the .01 level or less. Overall F significant at less than .01.

^f, ^h, ^i For this cell, by Scheffé test, means differing by [illegible] or more differ significantly
from each other at the [illegible] level.
TABLE 6
MEANS AND STANDARD DEVIATIONS FOR SCHOOL APTITUDE (INTELLIGENCE), STANDARDIZED
ACHIEVEMENT TEST RESULTS, AND AVERAGE TEACHER MARKS FOR
ADVANTAGEMENT BY RACE BY SEX SUBGROUPS^a,e


                                                     Achievement test    Average teachers’
                                  Intelligence       grade level         marks
Group                        n    M         SD       M         SD        M        SD

Advantaged black boys       59     92.6^b   20.1     5.05^c    1.13      2.3^d     .60
Advantaged white boys       57    106.5     16.1     5.78      1.23      2.2      1.00
Advantaged black girls      57     92.9     16.3     5.58      1.24      2.6       .80
Advantaged white girls      57    104.6     17.1     5.85      1.00      2.8       .80
Disadvantaged black boys    56     73.0     20.5     4.71       .78      2.0       .70
Disadvantaged white boys    49     84.8     13.6     5.23       .75      2.2       .80
Disadvantaged black girls   53     77.9     13.3     4.74       .78      2.3      [illegible]
Disadvantaged white girls   55     92.9     16.0     5.81      1.02      2.7       .70


^a F for each column of eight means is significant at less than .01.

^b Pairs of means in this column that differ from each other by 27.5 points or more differ at the .01
level or less (Scheffé test); those differing 24.1 to 27.4 points or more, at less than .05 but greater than
.01.

^c In this column, means 1.65 or further apart differ at the .01 level or less from each other (Scheffé);
those 1.45-1.64 at less than .05 but greater than .01. No individual pairs differ significantly.

^d In this column, means 1.3 or further apart differ at the .01 level or less from each other (Scheffé);
those 1.1-1.3 at less than .05 but greater than .01. No individual pairs differ significantly.

^e It should be noted that these data come from achievement tests given early in the fall term of school,
soon after the end of the summer recess.


verely than girls of either race, even though 
(as in the case of the advantaged white 
boys and girls) there is neither a statisti- 
cally nor a practically significant difference 
in accomplishment as measured by stand- 
ard achievement tests, and no difference in 
intelligence (mean IQ for advantaged white 
boys is 106.5; for advantaged white girls, 
104.6). 

Among the four advantaged groups, black 
boys are clearly worst off in achievement 
test standing, even though they stand at 
almost exactly the same intelligence level as 
advantaged black girls. 

The picture for the four disadvantaged 
groups is not greatly different, except that 
disadvantaged black girls join disadvan-
taged black and white boys in the low intel-
ligence, low achievement test mean, and low
average teachers’ marks category. The dif- 
ferences favoring disadvantaged white girls 
over the other three disadvantaged groups 
are consistent for all three school-related 
variables, and seem large enough to be of 
practical significance, particularly when the 
moderate strength and consistency of corre-
lations for this group among intelligence,
achievement test standing, and teachers’
marks are considered (see Table 3, row 8). 
For the eight subgroups of the total sample, 
this level of consistency and strength of in- 
terrelations is exceeded only for the disad- 
vantaged black girls, but for them the ac- 
curacy-consistency phenomena are linked 
to relatively low teachers’ marks (Table 3, 
row 7). 


Discussion 


First, it was hypothesized that a conven-
tional group intelligence test will predict
standardized achievement test results
equally well for boys and girls, but better
for advantaged than disadvantaged chil-
dren. Results fail to bear out the hy-
potheses, strikingly so in the case of the
advantagement-disadvantagement predic-
tion: Girls’ standardized test results are
predicted better than those of boys, and
disadvantaged children’s results are pre-
dicted strikingly better than advantaged
(for advantaged, 4% of the variance as op-
posed to about 55% for disadvantaged).
Additionally, standard achievement test
performance of black children was pre-
dicted much better by intelligence than it
was for whites (about 30% of the variance
for black children, about 3% for white chil-
dren).

For the advantaged, for boys, and for
whites in this sample, factors other than 
California Test of Mental Maturity IQ ac- 
count for most of the variance of school 
accomplishment as measured by the Metro- 
politan Achievement Test. Speculatively, 
factors responsible may be attention, moti-
vation, rapport with teachers, conformity,
or any one of many other things that may 
possibly be correlated with intelligence for 
the disadvantaged, for girls, and for blacks, 
but not for the advantaged, for boys, or for 
whites. In any event and for whatever rea- 
son, tests of achievement and intelligence 
have very much more in common for a dis- 
advantaged than for an advantaged Atlanta 
population, and substantially more in com- 
mon for a black than a white population. 

The second hypothesis was that teachers’ 
marks, when judged against achievement 
test results, will be more accurate for girls 
than for boys, for middle-class than for dis- 
advantaged children, and perhaps least ac- 
curate of all for disadvantaged black boys. 

The prediction was supported modestly
for girls versus boys (about 15% of the var-
iance for girls, only 4% for boys). For nei-
ther sex is prediction satisfactory in the
useful-accurate sense. The prediction was
seriously awry in the case of social class
(only about 1% of the variance for advan-
taged, about 34% for disadvantaged).
Teachers’ marks were related moderately to
accomplishment as judged by standardized
achievement test results for disadvantaged
black boys (r = .53). The groups for which
no relation was shown between teachers’
marks and achievement test results were
advantaged white boys (r = -.17) and ad-
vantaged white girls (r = -.07). Specula-
tively, it may be that teachers mark the
advantaged and the white according to how
well socialized they are according to teach- 
ers’ standards, but mark the poor and par- 
ticularly the black poor more according to 
their objective performance. It seems that 
neither group is altogether well served if 
this should prove to be the case. 

The third set of hypotheses was that 
teachers consistently give girls higher 
marks than boys (this is clearly and con-
sistently borne out by the data); that there
are no important differences between boys 
and girls in their standard achievement test 
performance (not borne out by the data: 
Girls consistently performed better than 
boys); and that disadvantaged black boys 
will receive the lowest teachers’ marks of 
any group (borne out by the data; but they 
were also lowest in intelligence and per- 
formance on standardized achievement 
tests. However, in standard tests of 
achievement, disadvantaged black girls 
were at almost the same low level). 

Finally, the results provide a sobering 
caution about trying to predict anything for 
a special subgroup from the general popula- 
tion relationships: For example, intelligence 
and standardized achievement test standing 
are correlated .45 for the total population 
for this study, but —.35 for advantaged 
white boys and .80 for disadvantaged white 
girls. Similarly, correlations between teach- 
ers’ marks and achievement test standings 
range from —.17 for advantaged white boys 
to .73 for disadvantaged black girls. To say 
the least, caution about special group pre- 
diction is indicated. 
EFFECTS OF CATEGORIZATION, DEGREE OF BILINGUALISM,
AND LANGUAGE UPON RECALL OF SELECT
MONOLINGUALS AND BILINGUALS


MICHAEL PALMER


University of Northern Colorado


A free-recall procedure, utilizing categorized and noncategorized word
lists in English, Spanish, and a mixed condition, was used with three
groups of Spanish-English bilinguals and a monolingual-English
group. The amount of recall across all lists was greater for a cate-
gorized than for a noncategorized condition. The preferred language of
recall and clustering was English, regardless of the group’s degree of
bilingualism. The poorer performance in Spanish was interpreted as a
state of perceptual unreadiness, which was shown to create “interfer-
ence” for the subjects when they were presented with a task requiring
simultaneous switching between English and Spanish.


The argument has often been made that a
language is a coding scheme for its speak-
ers, of such a kind that speakers of different
languages categorize “standard” objects of
the environment differently, the manner
varying with the categories available in the
language being used (Brown, 1958). It is
readily acknowledged, however, that a lan-
guage is not just a set of coding categories;
it consists of categories of words and the
rules for joining them intelligibly. The com-
bination of words and rules forms a linguis-
tic system, a program for dealing with the
environment (Miller, Galanter, & Pribram,
1960). A program such as this provides a
means of identifying objects and events and
reduces the necessity of constant learning
(Bruner, Goodnow, & Austin, 1956).

A bilingual, a person who has varying
degrees of skill in two languages, needs cat-
egories and the rules for joining them for
both of his languages. Many times the cate-
gories and the rules overlap; but many
times they do not, which necessitates the
learning of two independent linguistic sys-
tems for the bilingual. He must learn ap-
propriate categories in each language, and
he must learn the cues useful in placing ob-
jects appropriately in his two systems of
categories. When one of these systems in-
fringes upon the other, “interference” may
result. If one of the bilingual’s languages is
not as structured as the other, then the pos-
sibility of interference arises. This interfer-
ence comes from highly accessible cate-
gories in one language that serve to block
alternative, less accessible categories in the
other language (Bruner, 1957). Interference
could be viewed, then, as perceptual un-
readiness in one or both of the bilingual’s
two languages.


* Requests for reprints should be sent to Michael
Palmer, Department of Psychology, University of
Northern Colorado, Greeley, Colorado 80631.

In general, the free-recall literature sug-
gests that subjects recall fewer items from a
list of unrelated words than from a list
comprising words that can be easily
grouped into some form of category (Bous-
field, 1953; Cofer, 1966; Cohen, 1963;
Cohen & Bousfield, 1956; Marshall &
Cofer, 1963; Tulving, 1962).

This study was an experimental investi-
gation of the differential effects of categori-
zation, the degree of bilingualism, and lan-
guage upon recall. Amount of recall and the
extent of category clustering were used to
assess the relative amounts of “interfer-
ence” among bilingual groups.


METHOD
Subjects


The subjects were assigned to four subpopula-
tions on the basis of their degree of bilingualism.
All subjects were elementary school children in the
fifth, sixth, seventh, or eighth grades. The subpopu- 
lations were designated as (a) strong English, (b) 
strong Spanish, (c) balanced English-Spanish, and 
(d) monolingual English. 

The degree of bilingualism was determined in 
two ways: (a) a self-report technique and (b) a 
measure of difference in reaction time. The sub- 
jects were asked to rate themselves as either domi- 
nant in Spanish, dominant in English, or equal in 
ability in both. All fifth-, sixth-, seventh-, and
eighth-grade students were given questionnaires.
On the basis of the questionnaire, subjects were as-
signed to one of the four groups: monolingual
English, strong English, strong Spanish, and bal-
anced English-Spanish. Ten subjects were then
chosen at random by means of a table of random 
numbers from each of the four groups. As the 
monolingual-English group spoke no Spanish, only 
the self-report technique was used in assessing 
their degree of bilingualism. 

The second technique involved the difference in
mean reaction times between pictorial identifica-
tions in English and Spanish. The picture-identifi-
cation subtest of the Stanford-Binet Form L-M
(Terman & Merrill, 1960) was used. The subjects
were asked to name one-half of the 18 items, or 9
items, in English and 9 items in Spanish. The items
selected for identification in either language were
selected on a random basis to insure equal diffi-
culty levels between the two languages. Before
each picture was shown, the subject was told in
which language to respond. Each trial was timed,
and timing was begun as the picture was turned,
and stopped as the subject responded. The reaction
times were recorded to the nearest one-hundredth
of a second. The means of the nine trials in each
language were compared. Each subject who was
25% or more slower reacting in Spanish was con-
sidered to be English dominant, and those who
reacted slower in English by 25% or more were
considered to be Spanish dominant. All subjects
who were less than 25% slower in reacting in
either language were considered as being balanced
in the two languages. The results of the self-report
technique and the reaction time technique were
compared to assure reliability of groupings. The
mean differences in reaction time were found to
agree identically with the self-rating scale.
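
The reaction-time rule can be written out as a short sketch. It assumes that "25% slower" is measured relative to the mean reaction time in the faster language, which the text does not state explicitly; the function name and the sample times are hypothetical.

```python
def classify_dominance(mean_rt_english: float, mean_rt_spanish: float,
                       threshold: float = 0.25) -> str:
    """Classify a subject from mean picture-naming times (seconds) in each language.

    Assumption: relative slowness is computed against the faster language's mean.
    """
    if mean_rt_english <= 0 or mean_rt_spanish <= 0:
        raise ValueError("mean reaction times must be positive")

    faster = min(mean_rt_english, mean_rt_spanish)
    slower = max(mean_rt_english, mean_rt_spanish)
    relative_slowing = (slower - faster) / faster

    if relative_slowing < threshold:
        return "balanced"
    # The language with the shorter mean reaction time is the dominant one.
    if mean_rt_spanish > mean_rt_english:
        return "English dominant"
    return "Spanish dominant"


# Hypothetical means of nine trials per language, in seconds.
print(classify_dominance(1.00, 1.30))  # slower in Spanish by 30%
print(classify_dominance(1.30, 1.00))  # slower in English by 30%
print(classify_dominance(1.00, 1.10))  # difference under 25%
```

Other baselines (for example, the slower language's mean) would shift the cutoff slightly, so the choice of denominator is the main design decision in this sketch.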


Materials 


The material to be learned consisted of two
basic word lists. The categorized (C) list was composed
of 40 words belonging to four semantic categories.
The categories were relatives, colors, four-legged
animals, and parts of the human body. Each cate-
gory included 10 different high-frequency words of
approximately the same frequency taken from the
norms of Battig and Montague (1969).
The words were assigned to the list in a random
order by means of a table of random numbers. As
this word list was to be presented once in English,
once in Spanish, and once in a mixture of one-half
Spanish and one-half English, a separate randomi-
zation was prepared for each presentation. The de-
cision as to which language a word on the mixed
list would represent was also determined by a table
of random numbers until 10 in each language were
chosen.

The second basic word list was noncategorized 
(NC). List NC was composed of one word from 
each of 40 different semantic categories, and was to 
be presented in the three language conditions also. 
The order in which the words were presented in 
each language condition was separately determined 
by a table of random numbers. For the mixed list,
the words to be given in either English or Spanish 
were selected by a table of random numbers after 
placement on the list, with the restriction that the 
mixed-language conditions of both List C and 
List NC have no more than four consecutive words 
in the same language. 
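
One way to picture the mixed-list restriction is rejection sampling: shuffle an even split of language labels and redraw until no run of the same language exceeds four words. The sketch below assumes a 40-word list divided evenly between the two languages and uses Python's pseudorandom generator in place of the printed table of random numbers used in the study; all names are illustrative.

```python
import random


def longest_run(labels):
    """Length of the longest run of identical consecutive labels."""
    if not labels:
        return 0
    longest = current = 1
    for previous, label in zip(labels, labels[1:]):
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
    return longest


def assign_languages(n_words=40, max_run=4, rng=None):
    """Return a language label per list position, half English and half Spanish,
    with no more than max_run consecutive words in the same language."""
    rng = rng or random.Random()
    labels = ["English"] * (n_words // 2) + ["Spanish"] * (n_words - n_words // 2)
    while True:
        rng.shuffle(labels)
        if longest_run(labels) <= max_run:
            return list(labels)


mixed_list = assign_languages(rng=random.Random(0))
print(mixed_list)
print("longest run:", longest_run(mixed_list))
```

Rejection sampling keeps every acceptable ordering equally likely, at the cost of occasionally discarding a shuffle that violates the run limit.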

Each of the six word lists, the three language 
conditions of List C and the three language condi- 
tions of List NC, were separately recorded on 
audiotape for presentation. The audiotapes were 
prepared with 2-second intervals between words. 
The Spanish lists were recorded by a native 
speaker of the local conversational dialect. 


Procedure 


All subjects were orally instructed by prere- 
corded directions in both Spanish and English. 
They were told to listen carefully to the words to 
be presented, as they would be asked to remember 
as many of them as they could. 

The word lists were presented one at a time. 
The order of presentation of the lists was random- 
ized for each subject by means of a table of ran- 
dom numbers. This way the orders of presentation 
(English-Spanish-mixed; English-mixed-Spanish;
Spanish-English-mixed; Spanish-mixed-English;
mixed-English-Spanish; mixed-Spanish-English)
varied for each subject.
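
The possible orders are the permutations of the three language conditions. A brief sketch follows, assuming a pseudorandom draw per subject in place of the table of random numbers; the names are illustrative.

```python
import itertools
import random

CONDITIONS = ("English", "Spanish", "mixed")

# All possible presentation orders of the three language conditions.
ORDERS = list(itertools.permutations(CONDITIONS))


def order_for_subject(rng):
    """Draw one presentation order at random for a single subject."""
    return rng.choice(ORDERS)


rng = random.Random(1)
for subject in range(1, 4):
    print(subject, "-".join(order_for_subject(rng)))
```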

The presentation of each list was followed by a 
2-minute recall period during which the subject's 
responses were tape recorded. The responses were 
later transferred to a data sheet for tabulation. 


Data Analysis 


Recall was tabulated for all subjects, and the
extent of clustering in categories and languages
was then calculated.

The statistical strategy consisted of five separate
factorial analyses of covariance. Covariates used
were (a) an index of socioeconomic status, (b)
vocabulary level, and (c) the age in months of the
subjects.

The index of socioeconomic status used was the
Two-Factor Index of Social Position (Hollings-
head, 1957), which was based on the level of edu-
cation and the occupation of the head of the house-
hold. To determine the vocabulary level of the
subjects, the vocabulary subtest of the Wechsler
Intelligence Scale for Children was used (Wechs-
ler, 1949). The use of these variables as covariates
was intended to reduce potential bias in the re-
sults.
The five analyses of covariance used the four
groups as one of the two factors; the other factor 
varied for each analysis. The five factors were 
these: (a) extent of language clustering under 
List C and List NC, (b) extent of category clus- 
tering on each of the three language situations, 
(c) amount of total recall under List C or List NC, 
(d) amount of recall under List C for each of the 
three language situations, and (e) amount of re- 
call under List NC for each of the three language 
situations. These five analyses were intended to 
reveal differences between the groups on each 
measure and between the other five factors, as 
well as the interaction between the groups and the 
other factors. 


RESULTS


The correlation matrix for the three co-
variate control variables and the dependent
variables for the groups under List C is
shown in Table 1. As can be seen, the corre-
lations among the covariate control varia-
bles are low, as expected, and indicate little
overlap between the variables. The correla-
tions of each of the covariate control varia-
bles with each of the dependent measures
are also low, suggesting that the relation-
ships between each of these pairs of varia-
bles are not strongly aligned. The correla-
tions among the dependent variables show a
trend toward significance at the .05 level,
which was to be expected as, in actuality, all
of the dependent measures reflect a verbal
component. Several of the measures reflect
the partitioning of the recall component and
represent the correlations of recall in one
language with recall in another language,
which for bilinguals would be expected to
be relatively high.

The correlation matrix for the three co- 
variate control variables and the dependent 
variables for the groups under List NC is
shown in Table 2. The correlations among 
the covariate control variables were not as 
low as expected, suggesting overlap among 
the variables. There is a trend toward sig-
nificance at the .05 level in the correlations
between the covariate control variables and
the dependent variables. In general it can 
be said that the older the subject, the higher 
the recall. Again, the correlations among 
the dependent variables are relatively high, 
reflecting the effects of partitioning the re- 
call measures. 

There were no significant F values at the 
.05 level for groups, language situations, or
for interaction in the extent of language 
clustering. This means that none of the 
groups used language as a means for clus- 
tering significantly more than any other 
group, and also that the amount of lan- 
guage clustering was not used significantly 
more under either List C or List NC. 

In the extent of category clustering, there
were significant differences among the 
groups (F = 3.93, df = 3/45, p < .05), and 
also significant differences among the lan- 


TABLE 1 
INTERCORRELATIONS OF COVARIATE CONTROL VARIABLES AND DEPENDENT VARIABLES 
FOR GROUPS UNDER List C 


(n = 20) 


Variable                         1       2       3

1. Age
2. Vocabulary                 [illegible]
3. SES                        [illegible]
4. Total recall               [illegible]
5. E recall                   [illegible]
6. S recall                   [illegible]
7. M recall                     .21    -.16     .23
8. E category clustering       -.07    -.26    -.13
9. S category clustering        .16    -.47*    .84*
10. M category clustering      -.04    -.06     .02
11. M language clustering       .32    -.06     .37

.65* —.65* 
52* .78* .23  .30 
.78* ..38 —.92* .50*  .23 
.68*  .57*  .45* ..80*  .44* 97 30 
-53* .88  .89  .07 — 1042: 


Note.—Abbreviations: E = English; S = Spanish; M = mixed; SES = socioeconomic status. 


* p < .05. 
TABLE 2
INTERCORRELATIONS OF COVARIATE CONTROL VARIABLES AND DEPENDENT VARIABLES
FOR GROUPS UNDER List NC


(n = 20)

                                              Variable
Variable                     1       2       3       4       5       6       7

1. Age
2. Vocabulary              -.46*
3. SES                      .27    -.74*
4. Total recall             .55*    .14     .19
5. E recall                 .26     .08     .10     .85*
6. S recall                 .55*   -.48     .49*    .81*    .55*
7. M recall                 .50*    .07    -.20     .65*    .92     .29
8. M language clustering    .81     .26    -.49*    .45*    .29     .02     .82*


Note.—Abbreviations: E = English; S = Spanish; M = mixed; SES = socioeconomic status.


* p < .05.


guage situations (F = 10.61, df = 2/45, p 
< .01). However, there was confounding
between these two factors, as the monolin- 
gual-English group’s performance in Span- 
ish was so poor that it was difficult to parti- 
tion the effects of the two factors. 

For total recall, there was no significant 
F value at the .05 level for groups. There- 
fore, it was inferred that the groups did not 
differ significantly in the amount of recall. 
The recall between List C and List NC dif- 
fered significantly (F = 8.99, df = 1/29, p 
< .01), indicating that the total recall for 
List C was significantly higher than the 
total recall for List NC. 

Under the categorized condition, using 
the Newman-Keuls test for multiple-group 
comparisons, it was found that the recall 
for the strong-Spanish group was significantly 
higher than the recall for the monolingual-English 
group (q < .05). Overall, 
there were significant differences for groups 
(F = 3.46, df = 3/45, p < .05) and language 
situations (F = 9.81, df = 2/45, p < 
.01) under the categorized condition. The 
differences among all other pairs of groups 
and language situations were not found to 
be significant. 

Under the noncategorized condition, there 
was significance for language situations 
(df = 2/45, p < .05). However, in 
tests on all ordered pairs of means for 
language situations, significance between 
the pairs was not achieved. 


Discussion 


The data indicate that categorization 
does facilitate recall. For each group, recall 
under the categorized condition exceeded 
recall under the noncategorized condition. 
This finding is in no way surprising and 
agrees with the majority of the literature 
surveyed prior to this research. However, 
when the total recall was analyzed by its 
components (recall in each of the language 
situations), the result was somewhat unex- 
pected. It was expected that the monolin- 
gual group would have superior recall in 
English, as compared with the bilingual 
groups, and in actuality the recall of the 
monolinguals was the poorest (Table 3). 

An explanation for these results was dis- 


TABLE 3 


ADJUSTED CELL AND MARGINAL MEANS FOR 
RECALL IN THREE LANGUAGE SITUATIONS 
UNDER CATEGORIZED CONDITION 


Monolingual 
English 1 
Strong English |13 
Strong Spanish |16 
Balanced 15 
Xx 14 


boo SI bo wo 
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covered when reviewing the raw data. When 
viewing the socioeconomic status levels of 
the cells, it was discovered that in effect the 
relative performance of the groups was in- 
versely related to their socioeconomic status 
level. That is, the higher the socioeconomic 
status of the cells, the poorer the perform- 
ance of that cell. 

Jensen (1969) discussed two levels of 
learning: associative and conceptual. Asso- 
ciative ability is tapped mostly by such 
tests as digit memory, serial rote learning, 
and free-recall types of activity. His find- 
ing was that lower socioeconomic status 
children surpass middle socioeconomic chil- 
dren of approximately the same IQ in per- 
formance on associative tasks. The results 
of recall as analyzed by individual compo- 
nents agree with this finding of Jensen. 

Recall was superior in the English-lan- 
guage situation for all the groups. The su- 
perior recall in English occurred in both the 
categorized and noncategorized conditions, 
and the results were as expected. All formal 
instruction for the students had been in 
English. Their Spanish instruction had been 
informal and had occurred at home. Earlier 
"interference" was described as the case in 
which the perceiver had a set of categories 
inappropriate for adequate prediction of his 
environment (Bruner, 1957). The poorer 
performance in Spanish for all the groups is 
interpreted as interference. The extent of 
category clustering was higher in English 
than in the other language situations. As 
English is the more highly structured lan- 
guage, its categorizations are more highly 
accessible and serve to block those less ac- 
cessible categories in Spanish. This also, 
therefore, constitutes interference. 

It has been suggested not only that semantic 
categorization facilitates recall, but also 
that the relative degree of bilingualism does 
not significantly affect recall. These results 
suggest that the subjects’ performance is 
not lowered because of language ability, 
and that the amount of recall and the ex- 
tent of category clustering can be used as a 
reflection of linguistic independence. 

The implications of these results for bi- 
lingual education programs are important. 
One of the assumptions on which the bilin- 
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gual education programs are based is that a 
child is to be penalized by instruction en- 
tirely in English. The child is then placed in 
a bilingual program so as to enable him to 
use his “stronger” language to facilitate the 
learning of his “weaker” language. Spanish 
has always been assumed to be the stronger 
language and English the weaker one. The 
results of this study suggest that the reverse 
is true. Therefore, are bilingual education 
programs helping the student, or are they 
causing interference within the student’s 
two language systems? 
STRUCTURAL DIFFERENCES BETWEEN LEARNING 
OUTCOMES PRODUCED BY DIFFERENT 
INSTRUCTIONAL METHODS¹ 


RICHARD E. MAYER² AND JAMES G. GREENO 
University of Michigan 


The concept of a binomial probability was taught using a method that 
emphasized calculating with the formula, and a method that empha- 
sized the meanings of the variables in the formula. Learning outcomes 
were tested using items of four kinds, including calculation for new 
problems and questions about general properties of the formula. Large 
interactions in transfer performance were obtained in three cases, indi- 
cating that the two methods produced structurally different learning 
outcomes. Results were interpreted in relation to a hypothesis that 
cognitive structures can vary in the connectedness that components 


have with each other (internal connectedness) and with other elements 
of the subject’s knowledge (external connectedness). 


The present study attempted to identify 
some of the consequences of varying two 
aspects of the procedure used for teaching a 
mathematical concept. The mathematical 
concept that was taught is the probability 
of obtaining r successes in N Bernoulli 
trials, that is, the concept of the binomial 
distribution. One aspect of instructional 
procedure that was varied is sequencing; 
some subjects began by learning about the 
component variables of the formula (the 
concepts “trial,” “success,” “probability of 
success,” etc.) and gradually learned to put 
them together, while other subjects were 
first introduced to the complete formula 
and gradually learned how the component 
variables figured in using the formula. 
A second aspect of the procedure that was 
varied is the degree to which the subject’s 
activity during the learning session was 
structured by requirements to work problems. 
¹ This research was supported by the Advanced 
Research Projects Agency under Contract AF 
49(638)-1736 with the Human Performance Center, 
University of Michigan. 

² Requests for reprints should be sent to Richard 
E. Mayer, Department of Psychology, Human 
Performance Center, University of Michigan, 330 
Packard Road, Ann Arbor, Michigan 48104. 


An idea that motivated this study is that 
different instructional procedures may re- 
sult in learning outcomes that are qualita- 
tively or structurally different. Most com- 
parisons between instructional procedures 
ask which procedure results in more learn- 
ing, or more efficient learning, of skill or 
knowledge that is taken as being qualita- 
tively the same in all cases. However, many 
theoretical considerations are encouraging 
for the possibility that appropriate varia- 
tion in teaching procedure would lead to 
skills or outcomes with different structural 
properties. For example, Wittrock (1963) 
pointed out that subjects who learn with 
different procedures perform different re- 
sponses in the process, and different re- 
sponses can be expected to produce different 
outcomes of learning. Roughead and Scan- 
dura (1968) argued that the task demands 
presented by different instructional proce- 
dures could lead to the subjects’ learning 
different systematic patterns of behavior, or 
rules, because the use of some rules might 
be required to successfully complete one in- 
structional program but not another. From 
a somewhat different point of view, Piaget 
(1970) and Ausubel (1968) have asserted 
that new learning involves development of 
cognitive structure that results from assimi- 
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lating new ideas and accommodating exist- 
ing structures. According to this idea about 
learning, different instructional procedures 
could activate different aspects of existing 
cognitive structure. And since the outcome 
of learning is jointly determined by the new 
material and the structure to which it is 
assimilated, the use of different procedures 
could lead to the development of markedly 
different structures during the learning of 
the same new concept. 

The technique used in the present study 
to identify differences between outcomes of 
learning was a series of transfer tests re- 
lated to different aspects of knowledge that 
a subject might have acquired during the 
learning program. We would be especially 
interested in a pattern of results showing 
that subjects receiving one training proce- 
dure performed better on some kinds of 
transfer tests but worse on other kinds of 
tests compared to subjects who had a dif- 
ferent training procedure. Such a result 
would refute the idea that the procedures 
led to outcomes differing only in the 
amount of learning. At the least, such a 
result would indicate that subjects learning 
with one procedure had acquired more of 
one aspect of knowledge involved in the 
teaching, and less of some other aspect, 
than subjects who learned with the other 
procedure. 


EXPERIMENT I 


This study compared the effects of learn- 
ing the binomial distribution concept using 
two methods, one beginning with a state- 
ment of the formula and emphasizing the 
mechanical operations involved in using it 
(Group F) and the other beginning by re- 
lating the variables in the formula to con- 
cepts that presumably were part of the sub- 
ject’s general knowledge (Group G). Both 
methods used in this experiment were rela- 
tively unstructured by demands for a sub- 
ject to work problems. Learning was evalu- 
ated using four types of test problems: (a) 
familiar problems (Type F) which were 
stated in the same way as example prob- 
lems given during training; (b) problems 
requiring a transformation (Type T), 
usually of an algebraic nature, to be put 
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into the familiar form; (c) unanswerable 
problems (Type U) which looked like familiar 
problems but actually set up inconsistent 
or otherwise impossible conditions; 
and (d) questions (Type Q) where the sub- 
ject was required to give a property of the 
formula or a constraint on situations in 
which the formula can be applied, rather 
than a computational answer. 


Method 


Subjects and design. Twenty female subjects 
were recruited in the fall term from a pool of Uni- 
versity of Michigan students who volunteered to 
participate in psychological experiments for pay. 
Each subject served in one cell of a 2 × 2 × 5 factorial 
design. One factor was the method of instruction 
(Group F or Group G). The other factors 
were sets of specific test items and orders of presenting 
the different types of test items. Every 
subject received test items of all four types, so 
the comparison between types of test items was a 
within-subject comparison. 

Materials. The two instructional methods were 
incorporated into typewritten teaching booklets. 
Both groups learned the concept expressed by the 
formula, 


$$P(X = r \mid N) = \binom{N}{r} p^{r} (1 - p)^{N - r},$$ 


where N is the number of trials, r is the number of 
successes, p is the probability of success, and 
P(X = r | N) is the probability of r successes in N 
trials. The teaching booklets were of about equal 
length, and material in both was presented in a 
series of pages, each containing a brief exposition, 
followed by an example using the ideas just explained 
and an exercise for the subject to work out. 
The ideas given in the booklet for Group G 
were: 

1. Introduction of the terms “trial,” “outcome,” 
and “event.” 

2. Notion of a success introduced as some designated 
event. 

3. The probability of a success explained as the 
proportion of possible outcomes leading to success, 
or as expected frequency of success over a long 
series of trials. 

4. Use of notation: N for number of trials, r for 
number of successes. 

5. Coordination of N, r notation to a specific 
sequence. 

6. The number of ways to get r successes in N 
trials given as $\binom{N}{r}$, with an explanation of how 
to calculate the value of this coefficient. 

7. Presentation of the fact that probability of a 
specific sequence is the product of the probabilities 
of individual outcomes and calculation in terms of 
p and 1 - p. 

8. Probability of r successes in N trials given as 
the product of the probability of a specific sequence 
by the number of such sequences, and presentation 
of the binomial formula. 
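
The build-up in the Group G outline (ideas 5 through 8) can be checked by brute force. The following Python sketch is not part of the original teaching materials; the function name and the example values, including p = 0.5, are assumptions chosen only for illustration. It lists every sequence of N outcomes, keeps those with exactly r successes, and sums their probabilities, which should agree with the binomial formula.

```python
from itertools import product
from math import comb, isclose


def enumerate_binomial(n_trials, r_successes, p):
    """Return (number of qualifying sequences, their total probability).

    Each sequence is a tuple of booleans (True = success). The probability
    of one specific sequence is the product of its individual outcome
    probabilities: p for a success and 1 - p for a failure.
    """
    n_sequences = 0
    total_probability = 0.0
    for sequence in product((True, False), repeat=n_trials):
        if sum(sequence) != r_successes:
            continue
        sequence_probability = 1.0
        for is_success in sequence:
            sequence_probability *= p if is_success else (1 - p)
        n_sequences += 1
        total_probability += sequence_probability
    return n_sequences, total_probability


# Illustrative values: N = 6, r = 3, and an assumed p of 0.5.
count, probability = enumerate_binomial(6, 3, 0.5)
formula_value = comb(6, 3) * 0.5**3 * (1 - 0.5) ** 3
assert count == comb(6, 3)                  # number of such sequences
assert isclose(probability, formula_value)  # matches the binomial formula
```

Enumeration is practical only for small N, since the number of sequences doubles with each added trial; it is meant as a check on the reasoning, not as a way to compute the probability.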

The ideas given in the booklet for Group F 
were: 

1. Presentation of the formula, and definition 
of its terms: P(X = r | N) as the probability that 
a certain outcome occurs r times in N trials, r as 
the number of times the thing happens, N as the 
number of times we try to get it to happen, and p 
as the chance of the thing happening on each 
trial. 

2. Instruction that P(X = r | N) is found by 
multiplying three quantities: $\binom{N}{r}$, $p^{r}$, and $(1 - p)^{N - r}$. 

3. Instruction that to find $\binom{N}{r}$, use a formula 
given in terms of factorials, and explanation of 
these terms. 

4. Presentation of the fact that $p^{r}$ is p multiplied 
by itself r times. 

5. Instruction in obtaining $(1 - p)^{N - r}$ from the 
quantities p, N, and r, and in finding the value of 
this term. 

6. Instruction that to find P(X = r | N), first 
find the three component terms given above and 
multiply them to obtain the result. 

As the reader may see from the outline given 
above, the material given Group G included more 
discussion of concepts, while the booklet for Group 
F had the character of a set of instructions and 
could be likened to a computer program for finding 
a numerical answer. 
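
To make the comparison with a computer program concrete, the Group F outline can be written as a short Python function. This is a hedged sketch, not a reproduction of the booklet: the function name is hypothetical, and the value p = 0.5 in the example is an assumption made for illustration.

```python
from math import factorial


def binomial_by_formula(n_trials, r_successes, p):
    """Follow the outline's order: find three terms, then multiply them."""
    # Term 1: the coefficient, from the formula given in terms of factorials.
    coefficient = factorial(n_trials) // (
        factorial(r_successes) * factorial(n_trials - r_successes)
    )
    # Term 2: p multiplied by itself r times.
    p_term = p**r_successes
    # Term 3: (1 - p) raised to the power N - r.
    q_term = (1 - p) ** (n_trials - r_successes)
    # Final step: multiply the three component terms.
    return coefficient * p_term * q_term


# Illustrative values: N = 6, r = 3, and an assumed p of 0.5.
print(binomial_by_formula(6, 3, 0.5))  # prints 0.3125
```

Like the outline it follows, the function applies the steps mechanically and contains no reasoning about whether the givens are sensible. If r exceeds N, `math.factorial` raises a `ValueError` on the negative argument, so the procedure fails rather than recognizing that the problem has no answer.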

Examples of the test questions and answers are: 

Familiar (F): N = 6, r = 3, p = ¥. Find 
P(X = r | N). The solution requires plugging the 
values for N, r, and p into the formula to get 


$$P(X = 3 \mid 6) = \binom{6}{3} p^{3} (1 - p)^{3}.$$ 


Transformed (T): p = = -— 
à ip = i, N= p =%, N 

jn 0. Find P(X = r | N). The solution requires 
the subject to solve simultaneous equations for N 
and r and then plug the values into the formula 


rin (GI) ex 


Unanswerable (U): N = 2, r = 3, p = 3. Find 
P(X = r | N). The correct answer requires recognition 
that r, the number of successes, cannot exceed 
N, the number of trials; hence the subject 
should enter no answer or a similar response on the 
answer sheet. Incorrect answers usually involved 
writing out a formula such as 


$$P(X = 3 \mid 2) = \binom{2}{3} p^{3} (1 - p)^{-1}.$$ 


Question (Q): Can r be greater than N? The correct 
answer also requires recognition that r, the 
number of successes, cannot exceed N, the number 
of trials. Hence the subject should reply no. 
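
The constraint tested by the Type U and Type Q items can be stated as a small check on the givens. The Python sketch below is illustrative only; the function name is hypothetical, and the requirement that p lie between 0 and 1 is a standard property of probabilities added here as an assumption, not something listed in the item descriptions above.

```python
def is_answerable(n_trials, r_successes, p):
    """Return True only if the givens describe a possible binomial problem."""
    if n_trials < 0 or r_successes < 0:
        return False
    if r_successes > n_trials:  # more successes than trials is impossible
        return False
    return 0.0 <= p <= 1.0  # p must be a probability


print(is_answerable(6, 3, 0.5))  # True: a familiar problem
print(is_answerable(2, 3, 0.5))  # False: r exceeds N, so "no answer"
```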
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A test set included three questions each of 
Types T, U, and Q, and six questions of Type F 
that were divided arbitrarily into two subsets 
called F1 and F2 for purposes of determining presentation 
order and for statistical analysis. Each 
subset had three items of a single type, including 
a problem stated in terms of N, r, and p (such as 
those given above), a problem about flipping coins, 
and a problem about rolling dice. (All of the ex- 
amples and questions for the subject given in the 
teaching booklets were stated in terms of N, r, 
and p or were about flipping coins.) Two separate 
sets of test items were constructed. 

Additional materials used in the experiment in- 
cluded a pretest consisting of a set of simple arith- 
metic problems designed to determine whether the 
subject had sufficient skill in computation to 
master the material in the teaching booklets. The 
pretest consisted of 10 items: 2 dealt with multiplication 
and division of integers; 4 dealt with raising a fraction, 
decimal, or integer to a power; and 4 
dealt with multiplication of fractions or decimals. 
The pretest also included 
questions about statistics and mathematics 
courses taken by the subject. A postexperimental 
questionnaire was also given, soliciting the sub- 
ject’s comments and asking the subject to report 
previous experience with the material presented 
during the experiment. 

Procedure. Subjects were run in groups of four. 
First, the pretest was given. Then the experi- 
menter gave a brief explanation of the experiment. 
Each subject was then given a teaching booklet, 
with different subjects in each session receiving 
different booklets. 

Subjects were instructed to read their booklets 
silently and answer the sample questions on blank 
sheets of paper provided by the experimenter. 
Subjects were encouraged to proceed at a comfort- 
able rate, to reread previous pages until all was 
understood, and to be prepared for a test on the 
material. Subjects were told they would have 30 
minutes to complete the booklets. All subjects 
finished in that time. Subjects who finished in less 
time were asked to sit quietly until the test began. 

Immediately after the reading period, instruc- 
tions for the transfer test were read and a practice 
item was given. Each subject was given 15 test 
items, each typed on an index card. Cards were 
presented, one at a time, and 90 seconds were 
allowed for a subject to answer each item using an 
answer sheet provided by the experimenter. Sub- 
jects were explicitly told to write the words no 
answer if they felt a question was unanswerable. 
Subjects were permitted to refer to their teaching 
booklets during the test, but once a new card was 
presented, a subject could not work on any other 


card. 
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Five different orders of presentation for test 
items were constructed. In each order, the three 
items in each of the five sets were presented suc- 
cessively. The orders of presenting the sets were 
determined by a 5 × 5 Latin square, and the orders 
for presenting the three kinds of question within 
each set were determined by a series of 3 × 3 Latin 
squares. 
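
A Latin square of presentation orders can be generated in several ways, and the text does not say which square was used. The Python sketch below uses a simple cyclic construction as an assumption; the function name is hypothetical, and the labels are the five item sets named in the Materials section.

```python
def cyclic_latin_square(labels):
    """Return len(labels) orders; each label occurs once per row and column."""
    n = len(labels)
    return [[labels[(row + col) % n] for col in range(n)] for row in range(n)]


# The five item sets; the cyclic construction itself is an assumption.
item_sets = ["F1", "F2", "T", "U", "Q"]
for order in cyclic_latin_square(item_sets):
    print(" ".join(order))
```

Each row is one order of presentation, and each item set appears exactly once in every serial position across the five orders, which is the balancing property a Latin square provides.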

After the transfer test, the postexperimental 
questionnaire was administered. No subjects who 
participated in the experiment indicated previous 
familiarity with the material at this time. 


Results 


In the pretest, four subjects indicated 
previous familiarity with the binomial for- 
mula, and were therefore eliminated from 
the experiment. All subjects were judged to 
have adequate computational skill for the 
experiment. Each subject answered at least 
7 of the 10 pretest problems correctly. The 
mean scores for Group F and Group G sub- 
jects were 9.4 and 9.1, respectively. 

Transfer test performance was scored 
with each answer recorded as either correct 
or incorrect. Answers in proper form but 
unfinished or wrong because of computa- 
tional error were scored as correct. 

The results are summarized in Table 1. 
An analysis of variance was performed. As 
is clear from Table 1, there was not a sig- 
nificant difference between the overall per- 
formance of the two treatment groups. The 
interaction between instructional method 
and type of transfer items was significant 
(F = 5.58, df = 4/12, p < .025). The fact 
that Group F's transfer performance was 
substantially better than Group G’s on fa- 
miliar items, but substantially worse in 
identifying unanswerable problems and an- 
swering questions about the formula, sup- 


TABLE 1 
PROPORTION OF CORRECT RESPONSE BY TYPE OF 
ITEM AND INSTRUCTIONAL METHOD: 
EXPERIMENT I 

                                   Type of item 
Instructional method   Familiar   Transformed   Unanswerable   Question 

Formula                  .75         .57           .43            .43 
General concept          .48         .40           .63            .83 


Note. Main effect of method: ns; interaction 
between method and type of item: p < .025. 
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ports the idea that the cognitive structures 
acquired by the two groups differed in more 
than a simple quantitative way. 
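
The pattern behind this conclusion is a crossover: one method leads on some item types and trails on others. A minimal Python sketch of that criterion follows, using the proportions as read from Table 1 of Experiment I. The helper names are hypothetical, and the sketch only inspects the signs of the differences; it is not the analysis of variance reported above.

```python
# Proportions correct by item type, as read from Table 1 of Experiment I.
item_types = ["Familiar", "Transformed", "Unanswerable", "Question"]
group_f = [0.75, 0.57, 0.43, 0.43]  # formula emphasis
group_g = [0.48, 0.40, 0.63, 0.83]  # general-concept emphasis


def method_differences(first, second):
    """Difference in proportion correct (first minus second) per item type."""
    return [round(a - b, 2) for a, b in zip(first, second)]


def is_crossover(differences):
    """True when one method leads on some item types and trails on others."""
    return any(d > 0 for d in differences) and any(d < 0 for d in differences)


differences = method_differences(group_f, group_g)
for item_type, difference in zip(item_types, differences):
    print(f"{item_type:<13}{difference:+.2f}")
print("crossover:", is_crossover(differences))
```

A purely quantitative difference between methods would give differences of the same sign for every item type, so `is_crossover` would return False.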

Responses to the postexperimental ques- 
tionnaire revealed no additional subjects 
who had previous familiarity with the ma- 
terials. In their comments, subjects in 
Group G tended to complain about time 
constraints in testing and often expressed 
uncertainty as to how to use the formula, 
but they often felt they had learned some- 
thing and generally they enjoyed the exper- 
iment. Group F subjects complained that 
the teaching booklet was dull and mechani- 
cal, expressed uncertainty about what the 
symbols really meant, and expressed a sense 
of uneasiness about the experiment as if it 
were incomplete; however, several Group F 
subjects displayed an interest in the material 
by asking specific questions about the 
formula and finding minor technical points 
to be clarified. 


EXPERIMENT II 


This study was designed to extend the 
result of Experiment I. A somewhat differ- 
ent population of subjects was used, and 
groups were included that received both of 
the teaching booklets. 


Method 


Subjects and design. Thirty-two male and 32 fe- 
male students attending the University of Michigan 
summer half term served as paid subjects. The 
design was a 4 × 2 × 4 factorial with repeated 
measures. One factor was instructional method, 
with Group F and Group G as in Experiment I, 
and two additional groups, FG and GF, who received 
both of the teaching booklets in the orders 
indicated. A second factor was sex of the subject. 
The third factor was the order in which transfer 
test items were presented. The within-subject variable 
was the type of transfer question, with Type 
F, T, U, and Q as in Experiment I. 

Materials. The teaching booklets, pretest, and 
postexperimental questionnaire were essentially 
the same as those used in Experiment I, with a 
few minor changes in wording to increase clarity. 
One set of transfer test items was selected, consisting 
of 12 items, including 3 items each of Types 
F, T, U, and Q from the items used in Experiment 
I. 


Procedure. The procedure was the same in all 
respects as that used in Experiment I, with the 
obvious exceptions that test orders were arranged 
using a 4 × 4 Latin square (rather than the 5 × 5 
Latin square of Experiment I), and subjects in 
Groups FG and GF were given both teaching 
booklets and a time limit of 60 minutes to go 
through them. 


Results 

In the pretest, no subjects indicated fa- 
miliarity with the binomial formula. Two 
subjects were eliminated from the experi- 
ment because they answered fewer than 5 of 
the 10 pretest problems correctly. Mean 
scores for male and female subjects who 
were included in the study were 9.3 and 8.4, 
respectively. 

Performance on transfer was scored as in 
Experiment I. The results are summarized 
in Table 2. Regarding Groups F and G, the 
general trends obtained in Experiment I 
were found again, and this was indicated by 
a significant interaction between instruc- 
tional method and type of transfer item (F 
= 18.0, df = 3/41, p < .001). 

The female subjects in Groups GF and 
FG seem to have performed about as well 
as those in Group F; apparently for these 
subjects, the teaching that emphasized gen- 
eral concepts led to the achievement of 
some of the knowledge gained by subjects 
who had the formula emphasis, but added 
little or nothing that was not included in 
the outcome of the teaching that empha- 
sized use of the formula. Note that unlike 
the female subjects who were in Experiment 
I, female subjects in Group G in this experiment 
did not do noticeably better on Type U 
and Type Q problems than their counterparts 
in Group F. 
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On the other hand, the male subjects in 
Group G performed better on Type U and 
especially on Type Q problems than their 
counterparts in Group F. Furthermore, 
male subjects in both groups receiving the 
two teaching booklets appear to have sur- 
passed the subjects in Group F in perform- 
ance on Type U and Type Q items. Appar- 
ently for the male subjects in Experiment 
II, the two methods of instruction did lead 
to qualitatively different cognitive out- 
comes in the sense that different aspects of 
knowledge or skill were acquired in the two 
procedures. These apparent differences be- 
tween the males and females in this experi- 
ment resulted in a significant interaction 
between type of transfer item and sex (F = 
4.71, df = 3/108, p < .005) and a signifi- 
cant three-way interaction between instruc- 
tional method, type of transfer item, and 
sex (F = 2.93, df = 9/108, p < .005). 

Significant marginal effects were obtained 
due to instructional method (F = 2.48, df 
= 3/32, p < .05), type of transfer item (F 
= 12337, df = 3/108, p < .001), and sex (F 
= 11.47, df = 1/32, p < .005). There was 
also one significant effect involving the 
order of presenting transfer items; the 
four-way interaction of instructional 
method, type of item, sex, and order of 
presentation (F = 2.60, df = 27/108, p < 
.001). 

Eight subjects (one male and seven females) 
gave responses on the postexperimental 
questionnaire indicating that they 


TABLE 2 
PROPORTION OF CORRECT RESPONSE BY TYPE OF ITEM AND INSTRUCTIONAL 
METHOD: EXPERIMENT II 


Sex      Instructional method 

Male     Formula 
Male     General concept 
Male     Formula, then general 
Male     General, then formula 
Female   Formula 
Female   General concept 
Female   Formula, then general 
Female   General, then formula 


Type of item 
Familiar | Transformed | Unanswerable| Question 

71 Dl .58 
3 a 79 E 
.07 54 +96 Aral 
.88 .67 .88 71 
E .46 .46 .58 
.29 .88 „54 .54 
.67 .83 .50 .50 
-15 .98 .50 .50 


Note. Main effect of method: p < .05; interaction between method and 
type of item: p < .001; main effect of sex: p < .005; interaction between 
sex and type of item: p < .005; interaction among sex, method, and type 
of item: p < .005. 
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had not followed instructions or learned the 
material. Their data were eliminated from 
the experimental results and other subjects 
were run in their places. Generally, com- 
ments by subjects in Groups F and G were 
similar to those described above for Experi- 
ment I. Most subjects in Groups FG and 
GF described the difference between the 
booklets quite accurately. Most male sub- 
jects responded about equally favorably to 
the two booklets. However, female subjects 
tended to regard the concept booklet, 
whether it was given first or second, as a 
complementary review of the formula book- 
let, which they regarded as being more im- 
portant. 


EXPERIMENT III 


The main result of Experiments I and II 
was the evidence that the same concept, 
taught by different methods, can produce 
learning outcomes that differ in a way that 
is more complex than a simple quantitative 
difference. This experiment tested the gen- 
erality of the main finding, using subjects 
similar to those of Experiment I, but using 
teaching booklets that were modified to 
structure the subjects’ behavior during the 
learning period more strictly. Additional 
sample questions were added to both teach- 
ing booklets, and a subject was instructed 
that if she made an error on a question, she 
was to reread the appropriate page on the 
booklet and try an additional sample ques- 
tion. 


Method 


Subjects and design. The subjects were 40 fe- 
male students recruited from the University of 
Michigan psychology subject pool during the 
winter term. The design was the same as in Ex- 
periment II, except that only female subjects 
were included. 

Materials. The two teaching booklets used in 
Experiments I and II were modified by the 
addition of three pages following each page of 
instruction. Each new page gave the answer to 
the question on the preceding page and an addi- 
tional question. The subject was instructed that if 
her answer to the preceding question was correct, 
she should go on to the next instructional page, 
but if she was wrong, she should go back to the 
previous instructional page, reread it, and then try 
the new question. In addition, an answer sheet was 
prepared to be used with each teaching booklet, 
containing a specific place for the answer to each 
question. (In Experiments I and II, subjects wrote 
answers on a blank sheet of paper, if they wanted 
to.) 

The pretest, transfer test items, and postexperi- 
mental questionnaire were the same as those used 
in Experiment I. 

Procedure. The procedure was the same as in 
Experiment II, except that subjects were run in 
groups of two, instructions were given about the 
questions in the teaching booklets, and the 
experimenter presented the transfer test items to 
each subject as soon as she finished her teaching 
booklet or booklets. 


Results 


In the pretest, no subject indicated pre- 
vious familiarity with the binomial for- 
mula, and each subject answered at least 7 
of the 10 problems correctly. Mean scores 
for subjects in Groups G, F, GF, and FG 
were 9.3, 9.2, 9.1, and 9.0, respectively. 

Performance on the transfer test items is 
summarized in Table 3. There is a hint of 
the interaction between item type and in- 
structional method obtained in the first ex- 
periments, but the interaction was far short 
of being significant (F = 1.00, df = 9/56, 
p > .20). The only significant effect was an 
interaction between the type and the spe- 
cific set of transfer items (F = 5.95, df = 
3/56, p < .05), and this can probably be 
attributed to the fact that in carrying out 
17 orthogonal F tests, one could easily be 
significant at the .05 level by chance. 
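
The chance argument can be made concrete with a short calculation. The Python sketch below assumes the 17 tests are independent and that every null hypothesis is true; the function names are hypothetical and the code is an illustration, not part of the reported analysis.

```python
def chance_of_any_significant(n_tests, alpha):
    """P(at least one of n independent tests is significant under the null)."""
    return 1 - (1 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha):
    """Expected number of tests reaching significance by chance alone."""
    return n_tests * alpha


print(round(chance_of_any_significant(17, 0.05), 2))
print(round(expected_false_positives(17, 0.05), 2))
```

Under these assumptions the probability of at least one significant result is roughly .58, and slightly fewer than one such result is expected, so a single isolated effect at the .05 level is weak evidence.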

Comments on the postexperimental ques- 
tionnaire were similar to those obtained in 
the other experiments except that comments 
expressing uncertainty about the experi- 


TABLE 3 
PROPORTION OF CORRECT RESPONSE BY TYPE OF 
ITEM AND INSTRUCTIONAL METHOD: 
EXPERIMENT III 

Type of item 


Instructional method 
Familiar | Transformed | Unanswerable | Question 


Formula 67 | .57 | -60 ed 


General concept .45 .50 AT 


Formula, then 

general .63 .97 -50 v 
General, then 

formula .50 .58 -63 He 


Note. Main effect of method: ns; interaction 
between method and type of item: ns. 
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ment were less frequent. Subjects in Condi- 
tions FG and GF indicated that the two 
booklets seemed to involve different ap- 
proaches, but there was no systematic pref- 
erence for one method or the other across 


groups. 


Supplementary Study 


It seems likely that the lack of significant 
effects in Experiment III was due to the 
change in teaching method involved in add- 
ing the sample questions and focusing the 
subjects’ attention on learning to answer 
the questions. However, a control group 
using the earlier materials was not included, 
so this inference is to some extent gratui- 
tous. In order to reduce the uncertainty on 
this matter, a small number of subjects 
were recruited from the same subject pool 
as was used for Experiment III and run 
using the booklets that omitted the extra 
sample questions. Eight subjects were run 
in the study, with four subjects each in 
Groups F and G. Transfer was tested using 
one set of items including three each of 
Types F, T, U, and Q. The procedure was 
like that of Experiments I and II. For 
Group F, the proportions of correct re- 
sponse for the four types of transfer item 
were 1.00, .67, .67, and .50, respectively. For 
Group G, the corresponding data were .50, 
.33, .67, and .92. These results are similar to 
those of Experiment I and the male subjects 
of Experiment II, and the apparent interac- 
tion was reliable (F = 6.23, df = 3/6, p < 
.05), despite the small number of subjects.
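
As a rough consistency check on the proportions just reported, the sketch below assumes each cell pools 4 subjects x 3 items = 12 responses (an inference from the design described above, not a figure stated in the text) and recovers the whole-number count that each two-decimal proportion would imply.

```python
# Assumption: each group-by-item-type cell pools 4 subjects x 3 items.
N_RESPONSES = 4 * 3

# Proportions correct for item Types F, T, U, Q as reported in the text.
reported = {
    "F": [1.00, 0.67, 0.67, 0.50],
    "G": [0.50, 0.33, 0.67, 0.92],
}


def implied_count(proportion, n=N_RESPONSES):
    """Return k such that k/n rounds to the reported proportion, else None."""
    for k in range(n + 1):
        if round(k / n, 2) == round(proportion, 2):
            return k
    return None


implied = {group: [implied_count(p) for p in props]
           for group, props in reported.items()}
print(implied)
```

Every reported value corresponds to a whole number of correct responses out of 12, so the proportions are at least internally consistent with that assumed cell size.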


GENERAL DISCUSSION 


The experiments reported here apparently have
given some information about what kinds of
differences between instructional methods can
produce differences in learning outcomes of a
qualitative or structural nature. Based on the
criterion of obtaining substantially better
performance on one kind of transfer test and
substantially poorer performance on another,
evidence for such a difference was found in
Experiment I and in one subgroup of subjects
used in Experiment II.

One explanation of the difference that
seems straightforward is that subjects in 
the different instructional treatments en- 
coded the information presented about the 
binomial formula in different ways. One 
reasonable hypothesis is that the booklets 
emphasizing general concepts tended to ac- 
tivate structures in the subjects’ previous 
knowledge involving concepts familiar to 
them in general experience, while the book- 
lets emphasizing the formula tended to acti- 
vate structures involving the ideas and 
techniques associated with arithmetic and 
mathematical calculations. Such a differ- 
ence as this produced in the subjects' active 
cognitive structures during learning would 
explain a considerable difference in the out- 
comes of learning under the two conditions. 
For subjects who received the formula em- 
phasis, the new ideas would be assimilated 
to schemas involving calculational tech- 
niques, while for subjects receiving empha- 
sis on general concepts, the new material 
would be assimilated to ideas of a more 
general kind, involving the subjects’ experi- 
ence with random events. 

An interpretation of the difference in 
terms of the learning outcomes achieved by 
the subjects can be developed by postulat- 
ing two variables in cognitive structure. 
One is the extent to which components of a 
structure are integrated or connected with 
each other and could be called internal con- 
nectedness. The other variable is the extent 
to which the components of a structure are 
connected or related to other elements in a 
subject’s general cognitive structure and 
this could be called external connectedness. 
A pictorial image of the two ideas is 
elicited by terms like sunburst and moon- 
glow, where a sunburst is meant to produce 
an image of a structure with strong external 
connectedness—strong connections to ideas 
outside of the structure—and a moonglow is 
meant to produce an image of strong inter- 
nal connectedness. 

To illustrate these ideas, consider the 
component variables p and r. Internally, p 
and r are connected by the operation of ex- 
ponentiation; p is some quantity that is 
raised to the rth power. If this connection 
is strong in the subject's cognitive structure, 
then a subject can carry out the operation 
of exponentiation easily and without hesita- 
tion. He may carry it out sometimes when 
it is inappropriate for him to do so, as in an 
unanswerable question with p = 3/2. 
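
The point about carrying out exponentiation when it is inappropriate can be illustrated with a short sketch. It assumes the standard binomial probability formula, C(n, r) p^r (1 - p)^(n - r); the symbol n (number of trials) and both function names are introduced here for illustration and are not taken from the instructional booklets.

```python
from math import comb


def binomial_term_unchecked(p, r, n):
    """Apply the formula mechanically, with no check on the inputs."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def binomial_probability(p, r, n):
    """Apply the formula only when the inputs describe a real situation."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    return binomial_term_unchecked(p, r, n)


# A sensible question: 2 successes in 3 trials with p = 1/2.
print(binomial_probability(0.5, 2, 3))

# The mechanical calculation still returns a number for p = 3/2,
# although that number cannot be a probability.
print(binomial_term_unchecked(1.5, 2, 3))
```

The unchecked version corresponds to a purely calculational use of the formula; the checked version corresponds to recognizing that a situation does not satisfy the constraints on p.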

The external connections involving p and 
r involve relationships that a subject real- 
izes between these variables and other con- 
cepts. For example, p may be understood as 
the proportion of possibilities that are 
counted as successes. This would connect p 
to some fairly abstract concepts. Or p might 
be connected to more concrete concepts such 
as the chance that heads will come up when 
a coin is tossed. External connections for r 
might involve the number of successes in a 
sequence of trials, or the number of times 
heads comes up in some tosses of a coin. 
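
The two kinds of connectedness can be made concrete with a small graph sketch. The node names loosely follow the examples in the text; the edge lists, the extra component n, the function name, and the idea of simply counting links are illustrative assumptions rather than the authors' model.

```python
def connectedness(edges, components):
    """Count links inside a structure and links crossing its boundary."""
    components = set(components)
    internal = sum(1 for a, b in edges
                   if a in components and b in components)
    external = sum(1 for a, b in edges
                   if (a in components) != (b in components))
    return internal, external


# Hypothetical components of the learned structure.
structure = {"p", "r", "n"}

# Many links among the components, few to outside ideas.
moonglow_edges = [("p", "r"), ("p", "n"), ("r", "n"),
                  ("p", "chance of heads")]

# Few links among the components, many to outside ideas.
sunburst_edges = [("p", "r"),
                  ("p", "chance of heads"),
                  ("p", "proportion of successes"),
                  ("r", "number of heads")]

print(connectedness(moonglow_edges, structure))
print(connectedness(sunburst_edges, structure))
```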

In relation to the present experiments, it 
seems likely that the instruction for Group 
F emphasizing a formula tended to produce 
cognitive structures with strong internal 
connectedness—moonglow structures—while 
instruction for Group G emphasizing 
general concepts tended to produce cogni- 
tive structures with strong external connect- 
edness—sunburst structures. The hypothesis 
about Group F is based on the idea that 
internal connectedness mainly involves 
arithmetic operations like exponentiation 
and multiplication, and that emphasis on 
computation in the instruction made it 
likely that a subject assimilated the bi- 
nomial concept to existing structures in- 
volving arithmetic calculation. The hypoth- 
esis about Group G is based on the idea of 
external connectedness as assimilation to 
concepts in the subject’s cognitive structure, 
a process that would be likely because of 
the emphasis on the meanings of individual 
variables in Group G's instruction. 

The idea of a difference in connectedness 
is consistent with the results of the transfer 
test. Group F subjects showed superior per- 
formance on Type F items, which are 
solved by straightforward use of the calcu- 
lating rules in the formula, and on Type T 
items, which also are primarily calcula- 
tional exercises. On the other hand, Group 
G subjects excelled on items of Type U and 
Type Q, which required a subject to inter- 
pret the formula in new ways. The differ- 
ence was clearest of course in Experiment I 
and in the results for male subjects in Ex- 
periment II. For this latter group, the use of 
both instructional methods apparently re- 
sulted in a combination of the outcomes 
that the two methods produced when they 
were given separately. If we interpret the 
difference as one involving the kind of con- 
nectedness developed in the structure, the 
combination of both teaching methods ap- 
parently gave a structure that had both in- 
ternal and external connectedness. 

These results support the contention that 
different methods of teaching a concept 
may differ in ways that are more compli- 
cated than merely leading to greater or 
lesser learning of the concept. It is not a 
new idea that one can be taught to under- 
stand what he is doing, rather than to get 
the right answer. The method of teaching 
that allows a subject to be good at getting 
the right answer seems straightforward, as 
is its outcome in a subject’s performance. 
There is an aspect of understanding what 
one is doing that may not be as obvious. 
The subjects who benefited from training 
with general concepts showed their best per- 
formance, relative to other subjects, in an- 
swering questions about the concept they 
had learned rather than in using the con- 
cept to solve problems. This suggests that 
an important outcome of understanding, in 
the sense of integration with the general 
content of a subject’s cognitive structure, 
may be the ability to deal with the proper- 
ties of a structure in a general way: to re- 
port properties of the structure or to recog- 
nize that a situation does not satisfy constraints 
in the structure. 

The results also indicate something about 
the limits on the possibility of using teach- 
ing methods to accomplish different goals. 
The female subjects in Experiment II did 
not show the same kind of interaction be- 
tween method and transfer performance as 
did female subjects in Experiment I or male 
subjects in Experiment II. The transfer per- 
formance of female subjects in Experiment 
II who had general emphasis training sug- 
gests that they acquired structures with 
about as much external connectedness but 
considerably less internal connectedness 
than subjects in other conditions. While in- 
formation that would allow a firm interpre- 
tation of this is not available, it seems 
plausible that the kind of structure to 
which subjects in other experiments assimi- 
lated this training may not have been in 
these subjects’ cognitive structures, or if it 
was there it may not have been activated 
by the particular teaching methods that 
were used. 

Another kind of limitation is indicated by 
the results of Experiment III, in which 
nearly all of the differential effect of the 
teaching methods observed in the other ex- 
periments was eliminated, apparently by 
the use of additional sample problems. By 
having subjects in both groups focus their 
attention on finding the answers to specific 
questions, apparently there was such simi- 
larity in their mathemagenic behavior 
(Rothkopf, 1970) that the differences in se- 
quence of presentation failed to produce a 
substantial difference in learning outcome. 


FACTORS IN VICARIOUS MODIFICATION OF COMPLEX 
GRAMMATICAL PARAMETERS 


TED L. ROSENTHAL AND WAYNE R. CARROLL 


University of Arizona 


Strong attention-focusing instructions surpassed weaker instructions 
for both grammatical parameters. Offering a substantial incentive 
failed to surpass a no-incentive variation. Presenting instructions and 
incentive information prior to, versus after, the model's demonstration 
failed to influence the results. Boys outperformed girls on both re- 
sponse measures. No significant interactions among variables were 
found. 


Bernstein (1970) views differences in ver- 
bal usage between lower- and middle-class 
children as reflecting a “restricted” rather 
than an “elaborated” code of speech. The 
lower-class child’s restricted code includes a 
small vocabulary, a tendency to use con- 
crete words, and a narrow range of syntac- 
tie forms (e.g. little use of subordinating 
clauses), whereas an elaborated code in- 
volves more complex and varied syntactic 
forms among which the middle-class child 
can more readily switch with changes in the 
social situation. Hess and Shipman (1965) 
found that lower-class children were indeed 
inferior to middle-class youngsters in sen- 
tence complexity and other measures of lin- 
guistic sophistication; it is thus of interest 
to investigate factors that may amplify the 
linguistic sophistication of economically dis- 
advantaged youngsters. 

A number of recent experiments have re- 
vealed powerful observationally induced ef- 
fects on children's production of abstract 
rule-governed responses. Rosenthal, Zim- 
merman, and Durning (1970) found that 
separate groups of children could rapidly 
adopt a model’s diverse styles of asking 
questions and then, without further tute- 
lage, generalize the several styles to new 
stimulus instances. In the language domain, 
Bandura and Harris (1966) found that a 
model’s performance could increase chil- 
dren's use of passive and prepositional con- 
structions; this result was subsequently rep- 
licated by Odom, Liebert, and Hill (1968). 
Carroll, Rosenthal, and Brysh (1969) dem- 
onstrated model-induced changes in young 
children’s production of simple verb tense 
and a kernel sentence pattern; these 
findings were replicated by Rosenthal and 
Whitebook (1970), who also found that 
the offer of a small (10¢) reward for good 
performance given with weaker instructions 
created imitative language effects no 
greater than attention-focusing instructions 
given without promise of any extrinsic in- 
centive. All the foregoing experiments stud- 
ied each child individually, and used 
spoken utterances as the response measure, 
thus imposing conditions uncharacteristic of 
classroom teaching. 

Analogous to a regular composition lesson, the
present procedures studied children in groups who
wrote their responses and used stimuli chosen
from mimeographed lexicons, and also presented
on a chalkboard by the model. The authors sought
to explore variables that might influence the
strength of vicariously induced adoption of
sophisticated grammatical constructions (i.e.,
the complex sentence and the past perfect tense)
by economically disadvantaged youngsters from
mainly Mexican-American backgrounds. Strong
attention-constraining instructions were compared
to weaker ones, and the offer of considerable
reward for good performance was compared with
the omission of the offer. Current theorizing
(e.g., Bandura, 1969; Postman, 1968) emphasizes
the distinction between learning and performance.
Because instructions of different strength, and
promise of reward or its omission, might plausibly
influence storage of information if given before
the model's demonstration, but might only
influence retrieval if given after the model's
demonstration, the sequence of presenting
instructional and incentive treatments was
systematically varied. Thus, the treatments
administered compared sex of child, strong versus
weak instructions, incentive versus no incentive
offered, and presentation of instructions and
incentive information before versus after the
model's demonstration in a 2 × 2 × 2 × 2
factorial design.
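
The crossing of the four two-level factors can be enumerated directly. The sketch below is only an illustration of the design as described; the dictionary keys and level labels are paraphrases chosen here, not terms fixed by the authors.

```python
from itertools import product

# The four factors named in the text, each with two levels.
FACTORS = {
    "sex": ("boys", "girls"),
    "instructions": ("strong", "weak"),
    "incentive": ("offered", "not offered"),
    "sequence": ("before modeling", "after modeling"),
}

# Every combination of one level per factor is one cell of the design.
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]
n_cells = len(cells)

print(n_cells)
print(cells[0])
```

Each of the 2 × 2 × 2 × 2 = 16 cells corresponds to one like-sexed group receiving one combination of instructions, incentive, and sequence.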


METHOD 
Subjects 


From eight seventh-grade classes at a school
located in an economically disadvantaged region
of Tucson, groups of 10 boys and 10 girls were
formed by randomly drawing names. One male
and one female group were next randomly assigned
to each combination of experimental (or control)
treatments. In all, 90 boys and 90 girls (ranging
from [illegible] to 14 years old) served as subjects.


Procedure 


Each group of 10 like-sexed children was brought
to a room, seated at 10 desks positioned in
[illegible] before a chalkboard, and provided with
response sheets and mimeographed word lists
alphabetically organized into classes of articles,
connectives, nouns, pronouns, and verbs as shown 
in Table 1. One writer always took the role of ex- 
perimenter, and the other, the role of model; both 
adults were, in appearance, Anglo-Americans in 
their mid-thirties with no striking departures 
from average characteristics. In base line, the ex- 
perimenter called attention to the format of the 
word list, noted that any word could be used as 
often as the child wished, and directed the chil- 
dren to “make up 12 sentences using the words on 
the list,” and to print their answers neatly. In both 
base-line and imitation phases, 11 minutes’ writ- 
ing time was allowed during which the experi- 
menter moved about as a proctor; after 11 min- 
utes he collected all word lists and response sheets, 
whether complete or not. 

Next, he introduced all experimental variations 
by saying, “This man (the model) is going to make 
up sentences and print them on the board...,” 
and gave further directions before or after sentence 
modeling according to the treatment combination 
described below. The model always produced the 
same 12 sentences in the same order, speaking 
each aloud before erasing and printing the next. 
All sentences followed the same pattern as il- 
lustrated by the first two examples presented: “1. 
After the priest had married the couple, the groom 
kissed the bride. 2. Because the team had defeated 
the rivals, the fans celebrated the victory.” When 
the model had erased his final sentence, the ex- 
perimenter completed treatment directions, distri- 
buted word lists and new answer sheets, and 
proctored while the children completed their imita- 
tion protocols. Then subjects’ promises not to dis- 
cuss the research with other students were ob- 
tained, and the children were thanked and returned 
to their several classes. Modeling was omitted for 
control subjects who completed a second response 
sheet directly after base line as a check on “spon- 
taneous” sources of response change. To assure 
comparability among groups, all data collection 
was accomplished in 7 successive school days. 


Treatment Variations 


The strong instructions called attention to the 
model’s performance, noting that his sentences 
had two parts and involved past action, and en- 
joining the children to attempt to reproduce his 
sentences as follows: 

We are interested in what you (are/were) able
to learn from watching the man. All his sentences
(have/had) two parts and (tell/told) about
what has already happened. See if you can give
the same sentences in the same way the man
(does/did). Try to remember all you can from
watching and listening to Mr. Rosenthal.


The quoted passage was omitted for weak instruction groups.

The “with-incentive” condition was informed:

We want you to do a good job. We will be working
with a few other groups in your school. To
reward people for doing a good job, we will give
$20 for a party to the group that, in our opinion,
does the best job. We have arranged with the
Principal that the best group will be allowed to
have a party together, and we will supply $20
for that party.

The quoted passage was omitted for “no-incentive”
groups.


TABLE 1
WORD LIST GIVEN TO ALL CHILDREN

Class        Words

Articles     a
             an
             the

Connectives  above       over
             after       since
             because     under
             for         until
             in          up
             on          when

Nouns        animals     cowboys     groom       pilot       runner      team
             audience    crew        herd        priest      sheriff     tire
             ball        driver      hero        rabbit      signal      trainer
             batter      fans        magician    ranch       snow        tricks
             bride       fire        medal       rangers     steak       victory
             cattle      flowers     moon        rivals      sun         wand
             cook        foreman     nail        river       surface     wheel
             couple      gardener    outlaws     rocket      survivors   whip

Pronouns     he          they
             it          we
             she         you

Verbs        am          changes     has         misses      prepared    rescues
             are         crack       have        perform     prepares    start
             be          cracked     is          performed   raid        started
             brand       cracks      kiss        performs    raided      starts
             branded     defeat      kissed      phone       raids       surprise
             brands      defeated    kisses      phoned      reach       surprised
             bunt        defeats     marry       phones      reached     surprises
             bunted      explore     married     pierce      reaches     was
             bunts       explored    marries     pierced     receive     wave
             celebrate   explores    melt        pierces     received    waved
             celebrated  flood       melted      plant       receives    waves
             celebrates  flooded     melts       planted     rescue      were
             change      floods      miss        plants      rescued     will
             changed     had         missed      prepare

Note. All words are in alphabetical order within each group.

In the before-modeling information sequence, 
the relevant combination of instructional and in- 
centive directions was presented just prior to the 
model’s demonstration; in the after-modeling se- 
quence, the relevant directions were presented 
immediately following his demonstration. 


Response Measures and Scoring 


Changes from base-line production of both complex
sentences and past perfect tense forms were
studied. Since a child could differ in his production
of sentences at base-line and imitation phases,
and since different children completed different
numbers of sentences, the sentence measure was
the difference between a child's proportion of
complex sentences in base line and imitation; any
correct complex (but not compound) sentence was
scored, whether modeled or not. The tense measure
was the difference in proportions between base-line
and imitation sentences containing the past perfect
tense correctly used (in the rare cases of multiple
pluperfect verbs used within a single sentence,
only one pluperfect was given credit). The protocols
were scored by both authors who, for each measure,
agreed in over 98% of responses; the few instances
of interscorer disagreement were resolved by
discussion.


TABLE 2
MEAN BASE-LINE PROPORTIONS AND MEAN
PROPORTION INCREASES AT IMITATION
FOR EXPERIMENTAL TREATMENTS ON
PAST PERFECT TENSE AND
COMPLEX SENTENCES

                   Past perfect tense        Complex sentences
Group              Base line   Imitation     Base line   Imitation

Instructions
  Strong             .008        .173          .018        .582
  Weak               .016        .078          .035        .323
Incentive
  Offered            .014        .135          .021        .494
  Not offered        .004        .111          .032        .412
Sequence
  Before             .013        .123          .019        .417
  After              .006        .124          .034        .489
Sex
  Boys               .004        .166          .016        .532
  Girls              .014        .081          .036        .374
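
The change measures defined under Response Measures and Scoring amount to a difference between two proportions for each child. A minimal sketch follows; the function names, the example counts, and the treatment of a child who wrote no sentences are assumptions made for illustration.

```python
def proportion(n_target, n_total):
    """Proportion of a child's sentences that contain the target form."""
    if n_total == 0:
        # Assumption: a child who wrote no sentences contributes 0.
        return 0.0
    return n_target / n_total


def change_score(base_target, base_total, imit_target, imit_total):
    """Imitation-phase proportion minus base-line proportion."""
    return (proportion(imit_target, imit_total)
            - proportion(base_target, base_total))


# Hypothetical child: 1 complex sentence out of 10 at base line,
# 6 complex sentences out of 12 after watching the model.
score = change_score(1, 10, 6, 12)
print(round(score, 3))
```

Working with proportions rather than raw counts keeps children who completed different numbers of sentences on a common scale.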

RESULTS


The base-line mean proportions of complex
sentences and past perfect verbs were compared
among the 18 experimental and control groups;
neither measure revealed significant² base-line
differences (largest F = 1.26, df = 17/162, ns).
Next, the combined experimental and control
groups were compared for mean proportion changes
from base line to imitation. For the pluperfect
tense measure, the pooled experimentals’ mean
proportion (.123) surpassed the controls’ (.002)
to a significant extent (F = 6.08, df = 1/178,
p < .02). For the complex sentences measure, the
experimentals’ mean proportion (.453) surpassed
the controls’ ([illegible]) to a significant extent
(F = 27.42, df = 1/178, p < .0001). Thus, provision
of the model's demonstration created substantial
increases over the scores of no-model control
subjects, for both grammatical parameters studied.
For each experimental treatment, the mean
base-line proportions and the mean increases from
base line to imitation are presented for both
dependent measures in Table 2.
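
The pooled experimental means quoted above can be cross-checked against the marginal means in Table 2. The sketch assumes that the two levels of a factor contain equal numbers of subjects, so the pooled mean is the simple average of the two level means; it uses the incentive rows of the table.

```python
def pooled_mean(level_means):
    """Average of level means, assuming equal numbers of subjects per level."""
    return sum(level_means) / len(level_means)


# Mean increases at imitation from Table 2, incentive offered vs. not offered.
tense_pooled = pooled_mean([0.135, 0.111])
complex_pooled = pooled_mean([0.494, 0.412])

print(round(tense_pooled, 3), round(complex_pooled, 3))
```

Both averages agree with the pooled experimental means given in the text (.123 and .453).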
The major 2 × 2 × 2 × 2 analysis of
variance of pluperfect tense responses re-
vealed a significant effect for instructions, 
with the strong instructions creating greater 
imitative learning (F = 8.67, df = 1/144, p 
< .01), and for sex, with the boys scoring 
higher (F = 6.33, df = 1/144, p < .02). No 
other main effect or interaction approached 
significance; indeed, all other F values 
equaled less than 2.0, except for the In- 
structions X Incentive X Presentation-Se- 
quence interaction term (F = 2.95, df = 
1/144, p > .08). 

² All significance levels reported in this paper are
based on two-tailed probability estimates.

The major analysis of the complex sen- 
tence measure once more revealed a signifi- 
cant effect for instructions, with strong- sur- 
passing weak-instructions groups (F = 
19.86, df = 1/144, p < .001), and for sex, 
with the boys again scoring higher (F = 
7.33, df = 1/144, p < .01). No other main 
effect or interaction term attained or ap- 
proached statistical significance. 
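
The degrees of freedom reported in this section follow from the group structure described under Subjects. The sketch below assumes 18 groups of 10 children (16 experimental cells plus 2 control groups); the helper names are illustrative.

```python
def one_way_df(n_groups, n_per_group):
    """Between-groups and within-groups df for equal-sized groups."""
    return n_groups - 1, n_groups * (n_per_group - 1)


def two_sample_df(n_total):
    """df for comparing two pooled sets of subjects."""
    return 1, n_total - 2


def factorial_error_df(levels, n_per_cell):
    """Error df for a fully crossed between-subjects factorial design."""
    n_cells = 1
    for k in levels:
        n_cells *= k
    return n_cells * (n_per_cell - 1)


print(one_way_df(18, 10))                    # base-line comparison of all groups
print(two_sample_df(180))                    # experimentals versus controls
print(factorial_error_df((2, 2, 2, 2), 10))  # error term for the 16 cells
```

These values reproduce the 17/162, 1/178, and 1/144 degrees of freedom quoted with the F ratios.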


DISCUSSION 


The results showed that sophisticated 
language forms could be vicariously elicited 
from economically disadvantaged children 
under classroomlike conditions. The failure 
of a substantial incentive to enhance lan- 
guage learning suggests that when a school- 
like task is presented, most normal children 
will strive to perform well with or without 
extrinsic reward (Rosenthal & Whitebook, 
1970). However, tedious tasks and actions 
that violate social norms or invite aversive 
consequences may require reward if obser- 
vationally learned responses are to be 
overtly performed (Bandura, 1965, 1969). 
The present superiority of strong atten- 
tion-focusing instructions emphasizes the 
value of orienting the child to information- 
transmitting procedures; such orientation 
may especially aid youngsters whose “re- 
stricted” repertoires may include deficits in 
knowing how to approach the educational 
content provided. 

The creation of comparable learning from 
information presented before or after mod- 
eling is consistent with Bandura's (1969) 
theory of vicarious acquisition processes 
which assumes that observational exposure 
“can produce relatively enduring, retrieva- 
ble images of modeled sequences [p. 133].” 
Although the stronger and weaker instruc- 
tional inputs produced differential imita- 
tion, the children's capacity to encode, 
maintain, and retrieve data was robust 
enough to transcend the positioning of in- 
structions, but still to reflect differences be- 
tween stronger and weaker instructional in- 
puts. Vicarious teaching methods display 
rule-governed paradigms as “chained” ar- 
rays, rather than as discrete components in 
piecemeal fashion. Therefore, if further re- 
search confirms rapid observational trans- 
mission of complexly organized material, 
vicarious techniques may provide a valua- 
ble shortcut for instituting stable mnemonic 
representations of “elaborated” verbal and 
conceptual codes. 


ANNOUNCING... 


“INTRODUCTION TO RESEARCH IN EDUCATION 
by Donald Ary, Northern Illinois University, 
,Lucy Cheser Jacobs, Indiana Universty, and 
Asghar Razavieh, Pahlavi University, Iran 

The inexperienced researcher will:find in this 

new text a comprehensive yet concise introduc- 
tion to.the purpose, philosophy, and basic tech- 
niques: of scholarly investigation in, education. 
The authors begin with a general overview of 
the scientific approach as'a means of acquiring 
;knowledge, and then proceed to examine the 
practical’ problems involved in: translating edu- 
€ational questions into statements amenable to 
scientific inquiry. 

They explore logically and clearly the roles 
of related research, hypothesis formulation, mea- 
surement, sampling, reliability and. validity re- 
search design, and the application of statistical 
solutions — precisely defining terms as they are 
introduced and pointing out which are the most 
widely used techniques, designs, and proced- 
ures. Thus, the student will be equipped to 
‘evaluate, important published .research and to 
design his own projects with a minimum of 
assistance, The authors also discuss the essen- 
tials-of writing research proposals and reports 
and the criteria for evaluating them; 
Publication: Spring 1972 416 pages, $9.00 (tent.) 


MIND AND CONTEXT IN THE ART 

OF DRAWING 

by Kenneth R. Beittel, Pennsylvania State 
University 


Based on a decade of thoroughgoing research 
experiments, this graduate-level textbook offers 
an empirical and speculative account of the 
drawing process and the drawing series and of 
the contexts in which they occur. The author 
stresses a process point of view in the form of 
a sequential analysis, of the act of drawing. Proc- 
esses are recorded by time-lapse photography, 
with the results used as a basis for analyzing 
consistent drawing strategies, as “feedback” de- 


vices for learning experiments, and as stimula- 
tive recall. for questioning an individual about 
the construction of his work.- = 

Emphasis on the drawing series leads.to in- 
sights on the subtle but lawful development of 
drawings, one out of another) over a period of 
time. Further: insight into the process itself ‘is 
gained by the author's attempts. to synthesize 
literature from psychology (about thinking, learn- 
ing, and problem-solving) and philosophy (aes- 
thetics), as well as critiques of narrow behavior- 
ism, and arguments for personal aspects of 
knowing. Questions related to theory and method 
in research are treated in the: latter chapters of 
the book, along with recommendations for the 
development of-a vital psychology of: art. 
Publication: Spring 1972. 288 pages, $10.00 (tent.) 


THE REGENERATION OF THE SCHOOLS 
Readings for Educational Psychology, 
Sociology, and Politics 
Edited by John P. DeCecco, San Francisco State 
College deine. 

Piaget versus Jensen and Skinner; packaged 
programs and computer instruction versus stu- 
dent selection of materials ahd methods of in- 
struction; student participation versus student 
revolution — these are but a sampling’ of the 
antithetical. issues and' viewpoints. brought to- 
gether in this new collection. The'editor, a lead- 
ing educational psychologist, believés that only 
through ‘genuine reform-in our schools\can we 
effect changes in the behavior of our students. 
The articles he has selected for this book con- 
stitute an eloquent case for the redirection of, 
work in psychology, sociology, and politics 'to- 
ward educational reform. 3 

Written by students as well; as. scholars, the 
readings delve deeply into the ‘controversies, 
conflicts, and failures: of today's schools, and 
focus on subjects seldom covered by other texts 
in the field — moral development, adolescent 
discontent, ‘curriculum packages, testing, nego- 
tiation of classroom conflicts. Chapter introduc- 
tions serve as provocative points of departure 
for the readings that follow. 
Publication: January 1972 
576 pages, $7.00 paper 


LEARNING ENVIRONMENTS: Readings 

in Educational Psychology 

Edited by William J. Gnagey, 

Patricia A. Chesebro, and James J. Johnson, 
all of Illinois State University 


This book of readings supplements the under- 
graduate course in education. The readings ex- 
amine. such topics as what teaching is like, 
development apart from school, measuring prom- 
ise and: attainment, motivation and practice, 
character and. socialization in the schools, the 
teacher's adjustment, and the class as a social 
group. An.especially strong section on the psy- 
chology of classroom discipline is included. 

Three features are unique to Learning En- 
vironments. The first is the use of focus questions 
before each article to alert the student to salient 
points in the reading ahead. The second is the 
feedback quiz, patterned after Pressey's concept 
of the “teach test,” which follows each article. 
It consists of quick multiple choice questions 
related to the preceding material. Answer keys 
provide immediate reinforcement. The third fea- 
ture is the transfer problem (or problems) follow- 
ing each reading, which encourages the student 
to consider the broader implications of what he 
has just read. The articles are correlated by 
chart with specific chapters in the major texts. 
Publication: January 1972 
416 pages, $6.00 paper 


PSYCHOLOGICAL FOUNDATIONS OF 

EDUCATION, Second Edition 

By Morris E. Eson, State University of New York, 
any 


Like the successful first edition, this text inte- 
grates the several schools of psychology — par- 
ticularly S-R learning theory, cognitive theory, 
and the phenomenological position—and demon- 
strates how these illuminate the different kinds 
of practical issues in education. The author has 
thoroughly. revised each chapter to incorporate 
new material, including an up-to-date discussion 
of Piaget's theory of cognitive development and 
its classroom implications; considerable empha- 
Sis on the teaching of values and the importance 


of the teacher-pupil relationship; a review of the 
research on educational intervention; frequent 
references to the field of language acquisition; 
a new interpretation of the IQ.controversy; and 
an assessment of the contribution of educational 
technology. 

Summaries and annotated lists of supplemen- 
tary reading are features of each chapter of the 
book. A Workbook-Study Guide containing a pro- 
grammed presentation of each chapter, self-test 
items, and suggestions for student projects will 
accompany the text. And, an /nstrucior’s Manual 
with short. questions for each chapter will also 
be available. 

Publication: January 1972 
608 pages, $10.00 


now available... 


ENCOUNTERS WITH THE SELF 
by Don E. Hamachek, Michigan State University 
1971. 288 pages $5.00 paper 


FUNDAMENTALS OF CHILD DEVELOPMENT 
by Harry Munsinger, University of California, 
San Diego 

1971 376 pages $11.50 


READINGS IN CHILD DEVELOPMENT 
Edited by Harry Munsinger 
1971 425 pages $7.00 paper 


READINGS IN EDUCATIONAL PSYCHOLOGY 
Edited by George J. Mouly, University of Miami 
1971 576 pages $5.95 paper 


SUCCESS IN THE CLASSROOM 

An Approach to Instruction
by Marie G. Hackett, Florida State University 
1971 128 pages $3.00 paper 


Please write to Marie N. Mastorakis, EP4, 
College Promotion, for more information. 
HOLT, RINEHART AND WINSTON, INC. 
383 Madison Avenue, New York, New York 10017


THE PSYCHOLOGY OF 
SECOND LANGUAGE 
LEARNING 


Edited by PAUL PIMSLEUR 
and TERENCE QUINN 


Nineteen papers from the Second Inter- 
national Congress of Applied Linguistics, 
focusing on the various psychological 
approaches which now characterize 


AN INTRODUCTION 
TO THE PSYCHOLOGY 
OF RELIGION 


Third Edition 
ROBERT H. THOULESS 


ies, and places greater emphasis than did 
the earlier editions on non-Christian 
religious systems and problems of mutual 


toleration. 
Cloth $7.50 Paper $2.75 


Cambridge University Press
32 East 57th Street,
New York, N. Y. 10022


TWO GOOD 
REASONS 


TO REVIEW 
THE LEARNING 
PROCESS 


New THIRD Edition! 

PSYCHOLOGY OF LEARNING AND TEACHING 

Harold W. Bernard, Oregon State System of Higher Education. 1972, 448 pages, $8.95, 
Instructor's Manual available. A rapidly changing society necessitates the constant 
updating of teaching methods and attitudes toward education. The third edition of 
this text emphasizes the various elements influencing a student's ability to learn, 
prominent among these are cultural setting and ego concept. How a student feels 
about himself has a profound effect upon the learning process, as do the outside 
pressures that help to form his self-concept. An educator, if he is to teach and learn 
himself, needs to understand these influences. The scope of the new edition includes 
elementary through adolescent pupils, individuals and groups, gifted and slow pupils, 
and the culturally different.


New FIFTH Edition!

CHILD DEVELOPMENT
Elizabeth B. Hurlock, formerly Lecturer, Graduate School of Education, University of
Pennsylvania. 1972, 384 pages, $10.95, Instructor's Manual available. Covers the de- 
velopment of the child from conception to puberty in a format that has proved so 
successful in previous editions: coverage by developmental categories rather than 
by age levels. Using an approach which is both scientific and practical, the book 
reflects the findings of the latest studies in the field and also shows how this
knowledge can be applied in many practical situations. The extensive citations and
bibliographies of the fifth edition will maintain “Child Development's” reputation as 
perhaps the best and most complete sourcebook and reference guide in the field, for 
both students and teachers. 


McGRAW-HILL BOOK COMPANY
Publisher of the Carnegie Commission on Higher Education Series
330 West 42nd Street, New York, N.Y. 10036


All prices subject to change. 


The Psychology of Open Teaching and Learning
An Inquiry Approach


Jerome S. Allender
Jay M. Yanoff


all of Temple University


A new approach to the study of education, this book allows students to personally explore
problems in teaching and learning before they encounter them in their classroom. In contrast 
to the typical reader, the book deals with open education, both in theory and in practice. 
The Psychology of Open Teaching and Learning contains extensive activity choices and 
inquiry materials (problem simulations, role playing techniques, study and discussion
topics, resource materials) to guide the student in his own inquiry. In addition, the readings
themselves cover a wider range of issues and views than other books on open education. 
The selections, from books rather than journals, serve to stimulate student interest in a 
variety of books on education. Adaptable to foundations
of education, educational psychology, and many other
courses, the book can be used for independent study
or for group programs. Challenging and functional,
it involves the student in the teaching-learning
process, giving him a broad view of its problems
as well as an understanding of current ideas on education.


Paper approx. 384 pages April 1972 tentatively $5.95

College Division
Little, Brown and Company
Boston, Mass. 02106


For good measure...


Educational and Psychological Measurement 


and Evaluation 
JULIAN C. STANLEY, Johns Hopkins University 
KENNETH D. HOPKINS, University of Colorado 


Representing an extensive revision and enlargement of MEASUREMENT 
IN TODAY'S SCHOOLS, by J. C. Stanley, this new text examines the 
concepts basic to the development of educational measures and
standardized tests. It provides a thorough understanding of intelligence,
achievement, interest, personality, and sociological measures. It allows 
the student to analyze teacher-constructed tests and other measures of 
cognitive and affective domains. EDUCATIONAL AND PSYCHOLOGICAL 
MEASUREMENT AND EVALUATION summarizes research data on the 
influences of various factors (response styles, cultural disadvantagement)
on test performance. The supplementary, self-contained
programmed instructional unit on concepts necessary in test development,
evaluation and interpretation reduces the class time devoted to statistics. 
March 1972, 528 pp., cloth (23628-1) 


Perspectives in Educational and Psychological 


Measurement 

GLENN H. BRACHT, University of Minnesota 
KENNETH D. HOPKINS, University of Colorado 
JULIAN C. STANLEY, Johns Hopkins University 


New collection of readings—discusses and analyzes the development of 
tests with the aid of rich empirical studies that illustrate basic concepts. 
Treating recent developments in measurement and evaluation, the book
explores the ethics of collecting student data and maintaining records. 
Good companion readings selection—it supplements EDUCATIONAL AND 
PSYCHOLOGICAL MEASUREMENT AND EVALUATION. Topics covered
include: description of the National Assessment of Educational Progress;
Professor Cronbach's recent clarification of test validation; readings on
criterion-referenced measurement and marketing systems by Popham
and Husek; Ebel and Thorndike; analysis of the nature of intelligence by
Professor Jensen and culture-fair testing by Professor Anastasi. February
1972, 384 pp., paper (66090-2); cloth (66091-0)


Essentials of Educational Measurement 
ROBERT L. EBEL, Michigan State University 


EXCELLENT: the critically acclaimed MEASURING EDUCATIONAL
ACHIEVEMENT is now available in an extensively revised and expanded
new edition. The history and philosophy of measurement and recently
published tests of achievement and intelligence are all dealt with in
entirely new sections. Offers systematic and procedural suggestions for
classroom test planning, essay testing, and the writing of true-false test
items. Professor Ebel includes end of chapter summaries, and a
glossary of technical terms. A TEACHER'S MANUAL is also available,
containing solutions to problems, guides for project evaluation, and ...
multiple-choice test items. February 1972, 567 pp.
cloth (28599-9)


For more information write Box 500 


Prentice-Hall


Allyn and Bacon, Inc.


New '72

THE CAUSES OF BEHAVIOR: 
Readings in Child Development 
and Educational Psychology, 
Third Edition 

Edited by Judy F. Rosenblith, Wheaton College; 
Wesley Allinsmith, University of Cincinnati; and 
Joanna P. Williams, University of Pennsylvania, 
Philadelphia. Designed for courses in child 
development and educational psychology, this 
broad collection of recent articles covers the 
range of theory and method in psychology
today. An extensive Teacher's Manual accompanies
the text. 1972 Paperbound Est. 654 pp.


A SYMPATHETIC UNDERSTANDING
OF THE CHILD: Six to Sixteen
By David Elkind, University of Rochester. Dis- 
cussing some of the major aspects of child and 
adolescent development, children are viewed 
in the context of the social relationships in which 
they live and learn. 1971 Hardbound and 
Paperbound 154 pp.


PSYCHOLOGICAL AND EDUCA- 
TIONAL TESTING 

By Lewis R. Aiken, Jr., Guilford College. A
concise sourcebook covering four aspects of 
testing: background and methodology of test- 
ing; intellective tests; nonintellective tests, and 
contemporary issues and developments. A Student
Guide and a Test Manual are available.
1971 346 pp. 


CONTEMPORARY ISSUES IN
EDUCATIONAL PSYCHOLOGY
Edited by Harry F. Clarizio, Robert C. Craig, 
and William A. Mehrens, all of Michigan State 
University. Significant issues are classified into 
nine areas, with introductions by the editors, 
and presented in a pro and con format. A
Teacher's Manual is available. 1970 Paper- 
bound 747 pp. 


Allyn and Bacon, Inc., Dept. 893, College 
Division, 470 Atlantic Ave., Boston, MA 02210 


A New Book for Doing! 


EDUCATIONAL 
PSYCHOLOGY 


The Instructional 
Endeavor 


MOSBY 
TIMES MIRROR 


THE C. V. MOSBY COMPANY
11830 WESTLINE INDUSTRIAL DR.
ST. LOUIS, MISSOURI 63141


“Heaven help me... 
This is IT!” 


Prepare your students 
to confront the classroom! 


The future of this text is not a dusty bookshelf— 
this is a book for doing, not just reading once-over- 
lightly! Down-to-earth discussions, liberally laced 
with humor, push and pull your students into the 
exciting real world of teaching. In the author's 
own words, “This is a book about the psychology 
of teaching and learning. It is for teachers and 
it has only one purpose: to help build teaching 
skills that are psychologically sound.” 

With his intellectual feet firmly planted in 
classrooms rather than ivory towers, Dr. Charles 
helps your students develop a practical, no- 
nonsense approach to teaching. Lively, pragmatic 
examples give meat to abstract theory. The author 
vividly crystallizes the basic principles of learning 
in an easy-reference list and provides a valid, 
intriguing model of teaching. Your students will 
especially appreciate the concrete examples that 
demonstrate ways to motivate learning. Analyzing 
verbal interaction, formulating questions that 
trigger cognitive processes, and planning instruc- 
tion are only a few of the topics that can foster 
creative teaching! 

And, as the material reflects today's action- 
oriented student, the design of the text echoes 
Dr. Charles' sparkling commentary. Format,
type styles, and graphics maintain student in- 
terest while they visually stress important data. 

Bolster your students’ enthusiasm and knowledge 
—prepare them to face the classroom with con- 
fidence. Put this unconventional guide to work in 
your course next year! 

By C. M. CHARLES, Ph.D., Professor of Education, San Diego 


State College, San Diego, Calif. March, 1972. Approx. 448 pages,
7¾" × 10", 40 illustrations. About $4.95.


WHAT IS A RULE?

Rule-governed behavior is defined as a function involving classes of
overt stimuli and responses in which each class of overt stimuli is 
paired with a unique class of overt responses. This definition pro- 
vides a basis for analyzing many kinds of complex behavior; con- 
ceptual- and association-governed behavior are shown to be special 
cases. Rule-governed behavior is accounted for in terms of a rule con- 
struct, defined as a triple (D, O, R), where D refers to the set of 
n-tuples of stimulus properties which determine the responses, and O, 
to the operator which maps the properties in D onto the internal
responses in R. It is argued that the distinction between rules and
rule-governed behavior is important and should be kept in mind in
formulating research.


Knowledge has been defined as an “in- 
ferred capability which makes possible the 
successful performance of a class of tasks 
that could not be performed before learning 
... [Gagné, 1962, p. 355].” Although most
psychologists would probably agree with
this statement, there almost certainly would 
be some strong differences of opinion on 
how best to characterize the knowledge con- 
struct and thereby account for the broad 
transfer implied. 

While knowledge in this sense properly
refers to complex cognitive structures of all
sorts, the present concern is limited in scope
to simple rules. More particularly, the purpose
of this paper is to: (a) define rule-governed
behavior and show how this definition
provides a basis for analyzing many kinds 
of complex behavior—in the process we see 
how (so-called) rote and conceptual behav- 
ior turn out to be special cases, and (b) 
introduce a rule construct to account for
rule-governed behavior and indicate the im- 
portance of distinguishing between rules 
and rule-governed behavior in research. 


NATURE OF RULE-GOVERNED BEHAVIOR


In spite of the increasing attention being 
given to rulelike processes in psychological 
theorizing, there is no commonly accepted 
definition as to just what rule-governed be- 
havior is. For example, different investiga- 
tors talk about rules for combining attri- 
butes to generate a category (Haygood & 
Bourne, 1965), for testing hypotheses in 
discrimination learning (Levine, 1966), for 
adding numbers (Scandura, 1966, 1968, 
1969a, 1969b, 1970), and for naming colors
(e.g., “continuous” rules) (Scandura,
1967a). Attention has also been given to 
executive or control rules (e.g., Reitman, 
1965) and higher order rules (Roughead & 
Scandura, 1968; Scandura, 1970) of various 
sorts. 

In fact, it has been common practice to 
use the term “rule” when referring both to 
behavior and to theoretical constructs used 
to account for behavior. The purpose of this
section is to make this distinction explicit
by presenting a definition of rule-governed 
behavior and showing how this definition 
applies to a variety of situations. 

To avoid unnecessary confusion, let me 
emphasize from the beginning that I am 
using the terms overt (nominal) stimulus 
and overt response in a rather liberal fash- 
ion. By overt stimulus I mean any observa- 
ble display, even one which may take place 
over a period of time, which has been shown 
to elicit some behavior on the part of the 
subject in question. In this sense, the term 
has a meaning quite analogous to that de- 
scribed by Hocutt (1967), although it is 
used here in a somewhat broader sense than 
Hocutt might have intended. The term 
overt response refers to whatever behavior 
is of interest to the experimenter in a given 
situation. 

It is not enough, however, to say that 
rule-governed behavior involves a class of 
nominal (i.e., overt) stimuli and a class of
overt responses. As Berlyne (1965, p. 8) has 
emphasized, associations also mediate be- 
tween classes of overt stimuli and responses. 
Further clarification is needed if psycholo- 
gists who are interested in this area are to 
avoid talking past one another. 


Definition of Rule-Governed Behavior 


Although all rules make it possible to 
perform successfully on a class of tasks, the 
number of such tasks may vary greatly. If 
the possible pairings of nominal stimuli and 
overt responses are all functionally equiva- 
lent insofar as a given observer is con- 
cerned, then the behavior is typically said 
to be rote (i.e., association governed). For 
example, the nonsense syllables ZUG and 
MUR are said to be associated if there is a 
connection between a class of tokens (e.g., 
ZUG, zug, Zug,...or even ZUG [Trial 1], 
ZUG [Trial 2]...) of the nonsense syllable, 
ZUG, and a class of equivalent pronuncia- 
tions of the nonsense syllable MUR. The 
important point is that in responding to an 
arbitrary but equivalent overt stimulus, 
any one pronunciation would be equally as 
good as any other. 

According to this view, such behavior as 
“putting a cross through each circle” is also 
rote. Any overt response may be paired
with any overt stimulus. Thus, in spite of
the relatively large stimulus and response
variability allowed, all of the nominal stimuli
and overt responses are functionally
equivalent, just as with the nonsense syllable
pair. (In effect, what one might be
tempted to call rule-governed behavior, in 
the general sense described below, turns out 
to be nothing more than rote behavior in 
disguise.) 

It would seem that all overt stimuli or
responses, which are functionally equivalent
in this sense, necessarily must have
certain properties in common. We shall
make this assumption since it is crucial to
the present development. Thus all representations
of the nonsense syllable ZUG are
assumed to have certain invariant properties
in common. Invariant properties, then,
define equivalence classes of overt stimuli
(responses), namely, those classes consisting
of all and only those stimuli (responses)
having the invariant properties. Sets of invariant
properties (or the equivalence
classes they define) are called effective
stimuli and responses. It is worth noting
that by definition effective stimuli (responses)
are not directly observable. (This
fact takes on particular importance below
where we see how effective stimuli may be
defined in terms of the to-be-expected responses.
While clearly contrary to present-day
usage, this view seems to have a number
of important advantages in dealing with
rule-governed behavior, and the skeptical
reader is urged to read on and to think
about the matter before making a final
judgment.)

Rule-governed behavior involves more
than single equivalence classes of overt
stimuli and responses. In general, such behavior
involves classes of distinct classes;
that is, a class of classes of (nominal) stimuli
and a class of classes of (overt) responses.
Furthermore, the (distinctive)
classes of nominal stimuli are paired with
particular classes of overt responses. The
individual (overt) stimuli and responses
may not be paired in an arbitrary manner.
Thus, in considering the ability to add
(any) two numbers, it would be possible to
respond to the token (overt stimulus) 2 ...
with any representation of the number
(e.g., 8, eight, etc.) but not, say, with a
name for the number 6. In short, the equiv- 
alence classes involved in rule-governed be- 
havior are behaviorally distinct. (Pairings 
between [single] effective stimuli and [sin- 
gle] effective responses are called instances; 
different instances are said to be behavior- 
ally distinct. Finally, it should be noted 
that the terms set and class are used inter- 
changeably.) 

An implicit assumption made throughout 
the foregoing discussion, one that will be 
made explicit here, is that each effective 
stimulus is paired with exactly one effective 
response. In no case was consideration given 
to the possibility that there was more than 
one response per stimulus (i.e., that rule-
governed behavior involves arbitrary rela- 
tions). In effect, rule-governed behavior is 
defined as a function (in the mathematical 
sense). 

When defined in this way, conceptual and
rote behavior become special cases of rule-
governed behavior (e.g., Scandura, 1968).
Conceptual behavior is simply rule-governed
behavior in which the class of effective
responses contains exactly two elements
(i.e., each exemplar [effective stimulus]
is paired with one response, and each
nonexemplar with the other response). Rote
behavior is further restricted so that there
is only one effective stimulus and one effective
response; that is, the rule-governed
class includes exactly one stimulus-response
instance.
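
The definition can be made concrete with a minimal sketch in Python. It is an illustration only: the function names `is_function` and `classify`, and the toy instances, are assumptions introduced here and are not part of the original formulation. Instances are represented as (effective stimulus, effective response) pairs; the sketch first checks that each stimulus has exactly one response and then applies the two special cases just described.

```python
def is_function(instances):
    """True if every effective stimulus is paired with exactly one effective response."""
    seen = {}
    for stimulus, response in instances:
        # setdefault stores the first response seen for a stimulus and returns it
        if seen.setdefault(stimulus, response) != response:
            return False
    return True


def classify(instances):
    """Label a set of instances using the special cases described in the text."""
    if not is_function(instances):
        return "not rule-governed"
    stimuli = {s for s, _ in instances}
    responses = {r for _, r in instances}
    if len(stimuli) == 1 and len(responses) == 1:
        return "rote"            # exactly one stimulus-response instance
    if len(responses) == 2:
        return "conceptual"      # exemplars -> one response, nonexemplars -> the other
    return "rule-governed (general)"


# Hypothetical toy instances
rote = [("ZUG", "MUR")]
sorting = [("red square", "A"), ("red circle", "A"), ("blue square", "B")]
addition = [((5, 3), 8), ((2, 2), 4), ((1, 5), 6)]
arbitrary = [("ZUG", "MUR"), ("ZUG", "DAX")]

for name, data in [("rote", rote), ("sorting", sorting),
                   ("addition", addition), ("arbitrary", arbitrary)]:
    print(name, "->", classify(data))
```

The only design choice worth noting is that the classification depends on counts of effective stimuli and responses, not on the particular tokens used, which is the sense in which the special cases are nested.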


Applications and Extension of the Notion 
of Effective Stimulus 


To gain some feeling for the extent to
which rule-governed behavior is involved in
meaningful learning, consider a few tasks
that have little to do with computation.
(Clearly, it would take little imagination to
extend the addition example above to such
tasks as multiplication, taking square roots,
and finding the greatest common divisor of
two integers.)

What, for example, is involved in drawing
a logical inference? Consider the rule of
inference modus ponens: A implies B, A;
therefore, B. In this case, the stimuli are
pairs of statements (i.e., entities which can
be classified as either true or false), and the
responses are single statements that bear a
particular relationship to the two premises.
The inclusiveness³ of the associated classes
of stimuli and responses, while they ideally
would contain all pairs of statements and
conclusions which are related logically in
the way indicated, would in any particular
case be tempered by the subject’s ability to
recognize the applicability of the inference
rule.⁴
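
As a rough illustration of modus ponens viewed as a function from pairs of statements to single statements, consider the following sketch. The tuple encoding `("implies", A, B)` and the function name `modus_ponens` are assumptions made for the example; nothing in the original analysis depends on this representation.

```python
def modus_ponens(first_premise, second_premise):
    """Return B when given ('implies', A, B) and A; return None when the rule does not apply."""
    if (isinstance(first_premise, tuple)
            and len(first_premise) == 3
            and first_premise[0] == "implies"
            and first_premise[1] == second_premise):
        return first_premise[2]
    return None


# Hypothetical premises: the content of the statements is irrelevant to the rule
premise = ("implies", "it rains", "the street is wet")
print(modus_ponens(premise, "it rains"))   # the rule applies
print(modus_ponens(premise, "it snows"))   # the rule does not apply
```

The sketch accepts any statements whatever, which corresponds to the ideal class; a subject's actual class would be narrower, as the text notes.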

There are also many examples of rule- 
governed behavior in which the effective 
stimuli (i.e., the invariant properties) are
not immediately detectable. The following 
task in which the subject is presented with 
a row of six empty boxes on each trial pro- 
vides a good illustration. On the first trial, 
the correct response is to put an X through 
the first box; on the second, an X through 
the third box; and on the third, an X 
through the fifth box. On the fourth trial, 
the X goes in the first box and the sequence 
is repeated. The evident regularity clearly 
indicates that the behavior is rule governed, 
but the nominal stimuli on what have tradi- 
tionally been called "trials" are all equiva- 
lent. If our definition of rule-governed be- 
havior as a function is to apply, there 
should be at least as many effective stimuli 
as there are responses. 

A closer look at what is happening shows
that it would be highly unlikely for a subject
to give the appropriate response on any
given trial if he did not know (i.e., remember,
be told, etc.) what the correct response
was on the previous trial. Thus, the effective
stimuli (i.e., the set of invariant stimulus
properties required to elicit the effective
response of putting an X in the appropriate
box) must have been a composite of the
effective aspects (i.e., invariant properties)
of the six boxes with an imposed stimulus
trace (which in our terminology are properties)
indicating the previously correct response.
Notice that the simple expedient of
marking each nominal stimulus (i.e., row of
empty boxes) to indicate the previously
correct response would eliminate this additional
strain on the memory. (Such behavior
may be said to involve sequential rules.)


³ The inclusiveness of a rule-governed class is
determined by the number of test instances it contains.
One rule-governed class is said to be more
general than another if it includes all instances of
the latter plus some of its own. For related experimental
results, the reader is referred to Scandura,
Woodward, and Lee (1967) and Scandura and
Durnin (1968).

⁴ For example, in a recent pilot experiment conducted
in our laboratory (by Don Voorhies), many
high school subjects were able to draw the correct
inference when the premises (and conclusions)
were meaningless but were unable to do so when
they were meaningful and contrary to everyday
experience. Other subjects were able to give the
appropriate conclusion when the first premise was
written in the form, If A, then B, but not when it
was written in the form, B whenever A.
In a situation like this, it is not always 
immediately clear exactly what constitute 
the effective stimuli and what constitute 
the effective responses. Nonetheless, they 
usually can be identified, as in this exam- 
ple, by specifying exactly what the subject 
is supposed to do (including the variations 
in his performance that are allowable) and, 
in turn, exactly what stimulus variations 
might be allowed at the time the required 
behavior is to take place without effectively 
changing that behavior. In the case where 
the effective stimuli are entirely internal, of 
course, the defining properties (of these 
stimuli) must be determined on strictly hy- 
pothetical grounds by asking the question, 
What properties might the subject have 
used in generating each response? Notice 
that this is precisely the question we asked 
above to determine that the subject must 
have remembered his last response. (In ask- 
ing this question, the experimenter must 
have explicitly in mind one or more rules 
which would generate these responses.) 

Consider another example. In one of the 
experiments conducted by Haygood and 
Bourne (1965), subjects were presented 
with nominal stimuli to be sorted and the 
appropriate attributes for sorting the stim- 
uli into concept exemplars and nonexem- 
plars. To sort the stimuli correctly, the sub- 
jects had to determine the appropriate logi- 
cal rule for combining the attributes to gen- 
erate categories. An interesting result of 
this experiment was that the subjects 
tended to improve on subsequent concept- 
sorting tasks that involved the same logical 
rule, but new attributes. 

How might this behavior be interpreted 
in the terms we have discussed? Let us look
first at the behavior a subject might be expected
to exhibit on a given concept task
after the concept has been learned. In this
case, the effective stimuli would appear to 
correspond directly to the exemplars and 
nonexemplars used. The effective responses 
correspond to the two sorting operations 
(i.e., the two piles). 

Further, in this type of concept attain- 
ment task, a subject also might learn how 
to combine given pairs of simple attributes 
(properties), such as red, triangle, into the 
corresponding logical categories (e.g., red or 
triangle, large implies blue, etc.) associated 
with that type of task. For example, if the 
logical connective involved is a conjunction, 
then the corresponding logical rule would 
associate each pair of given attributes, say, 
red, large, with the corresponding conjunction,
red and large.
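
A logical rule of this kind can be sketched as a function that takes a pair of attributes and returns a category, which in turn sorts stimuli into two piles. The representation of stimuli as sets of attribute names, and the names `conjunction` and `sort_stimuli`, are assumptions made for illustration.

```python
def conjunction(first_attribute, second_attribute):
    """Map a pair of attributes onto the category 'first AND second'."""
    def is_exemplar(stimulus):
        return first_attribute in stimulus and second_attribute in stimulus
    return is_exemplar


def sort_stimuli(stimuli, is_exemplar):
    """Sort stimuli into two piles: exemplars and nonexemplars."""
    exemplars = [s for s in stimuli if is_exemplar(s)]
    nonexemplars = [s for s in stimuli if not is_exemplar(s)]
    return exemplars, nonexemplars


# Hypothetical nominal stimuli, each described by its attributes
stimuli = [
    frozenset({"red", "large", "triangle"}),
    frozenset({"red", "small", "square"}),
    frozenset({"blue", "large", "triangle"}),
]

# The same logical rule applied to different attribute pairs
red_and_large = conjunction("red", "large")
blue_and_triangle = conjunction("blue", "triangle")

print(sort_stimuli(stimuli, red_and_large))
print(sort_stimuli(stimuli, blue_and_triangle))
```

The point of the sketch is that `conjunction` itself does not change when the attributes change, which is one way to picture transfer to tasks with the same logical rule but new attributes.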

This simple observation provides a basis 
for explaining the progressive improvement 
found by Haygood and Bourne (1965) on 
tasks involving the same logical rule. In 
particular, we seek to describe the rule-gov- 
erned behavior of a subject who, given only 
the relevant attributes, (almost) never 
makes a mistake in sorting on a given class 
of tasks. In this case, the effective responses 
again correspond to the two sorting loca- 
tions used in the tasks. The effective stim- 
uli, however, constitute a much broader 
class. In order to sort correctly an arbitrar- 
ily given nominal stimulus (i.e., exemplar 
or nonexemplar), the subject obviously 
would need to know the relevant attributes. 
The effective stimuli, then, would necessarily
have to include the particular attributes
that are relevant to the problem (and which
Haygood and Bourne made directly available)
in addition to the nominal stimuli
themselves. This analysis differs somewhat
from my latest formulation (Scandura,
1971a, 1972).


CHARACTERIZATION OF A RULE 


In a series of earlier papers (Scandura,
1966, 1967a, 1968, 1969b, 1970) I have proposed
that rule-governed behavior can be
accounted for by a construct (I used the
term rule) which can be characterized as an
ordered triple (D, O, R), where D refers to
those properties of effective stimuli which
determine responses and O, to the operator
by which these properties are transformed
into the respective responses, R.

Since R is uniquely determined by D and
O, we could just as easily speak of the pair
(D, O), but the former characterization is 
more reminiscent of the way functions (in 
the mathematical sense) have been defined 
classically. That is, a function may be 
thought of as consisting of a domain, D, a 
range, R, and a rule or operator, O, connect- 
ing them so that there is a unique element 
in the range for each element in the domain. 
(This definition is equivalent to one in 
terms of ordered pairs of inputs and out- 
puts.) 

The ability to add (two numbers), for 
example, can be thought of as a triple in- 
cluding: (a) the set of ordered pairs of 
numbers (the elements in D which deter- 
mine the unique responses), (b) the set of 
sums (the numbers in R), and (c) an operator
which maps the first set into the second.
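
A minimal sketch of the triple for addition follows. The class name `Rule`, its methods, and the three operator functions are hypothetical names chosen for the example; the three operators correspond to the usual algorithm, repeated incrementing, and counting the union of two disjoint sets, and the sketch simply checks that they yield the same input-output pairs over a small domain.

```python
from dataclasses import dataclass
from typing import Callable


def usual_addition(a, b):
    """Stand-in for the usual addition algorithm."""
    return a + b


def increment_repeatedly(a, b):
    """Increment the first number as many times as the second."""
    total = a
    for _ in range(b):
        total += 1
    return total


def count_union(a, b):
    """Form two disjoint sets of sizes a and b, take the union, and count it."""
    first = {("first", i) for i in range(a)}
    second = {("second", i) for i in range(b)}   # tags keep the sets disjoint
    return len(first | second)


@dataclass(frozen=True)
class Rule:
    domain: frozenset      # D: the ordered pairs that determine the responses
    operator: Callable     # O: maps each element of D onto a response

    def behavior(self):
        """The rule-governed behavior: a mapping from each element of D to its response."""
        return {pair: self.operator(*pair) for pair in self.domain}

    def response_set(self):
        """R: the set of responses, uniquely determined by D and O."""
        return frozenset(self.behavior().values())


domain = frozenset((a, b) for a in range(4) for b in range(4))
rules = [Rule(domain, op) for op in (usual_addition, increment_repeatedly, count_union)]

# Different rules, same rule-governed behavior
print(all(rule.behavior() == rules[0].behavior() for rule in rules))
```

Because `response_set` is computed from `domain` and `operator`, the sketch also shows why the pair (D, O) carries the same information as the triple.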

In general, more than one rule may generate
the same rule-governed behavior.
Thus, in the addition example, the “underlying”
rule might be any one of the following:
(a) some version of the usual addition
algorithm; (b) increment the first number
as many times as the second; or (c) select
two (arbitrary) disjoint sets corresponding
to the two given numbers, form the union
(i.e., form the set which includes all and
only those elements in the two sets), and,
then, count the number of elements in the
union. Although they all work, the three
rules (or procedures) so defined differ
widely in efficiency. Furthermore, “knowing”
one of the rules does not necessarily
imply knowledge of the others. Thus, a
young child may be able to give the sum of
(several) numbers, especially when presented
with concrete embodiments of the
numbers (e.g., in the form of extended fingers),
but he may have no idea that such a
thing as an addition algorithm even exists.⁵


⁵ It should also be noted that D need not refer
to all the invariant properties of the effective
stimuli. Furthermore, different rules underlying
the same rule-governed class of behaviors may act
on different properties of the same effective stimuli.
In summing arithmetic number series, for example,
the rule defined by the formula, [(A + L)/2]N,
involves only the first term, the last term, and the
number of terms, which are properties, whereas
“sequential addition” involves every number in the
series.


Characterization of Concepts and
Associations


As was the case with rule-governed behavior,
the proposed definition of a rule is
compatible with conceptual and rote behavior.
For example, accounting for sorting behavior
(that is, distinguishing between exemplars
[e.g., red objects] and nonexemplars
[e.g., nonred objects]) requires a rule
of the sort: If red, put in container A; otherwise,
put in container B. The type of rule
underlying rote behavior is even simpler,
for example: Say ZUG when shown MUR.


The Importance of Identifying Underlying
Rules


It is my conviction that failure to be
aware of differences in the way rule-governed
behavior may be generated has sometimes
led to unnecessary confusion in experimental
research on complex learning. Consider,
for example, the important problem
of number conservation. Several decades
ago, Piaget observed that some young children,
between the ages of five and seven,
were able to compare the number of objects
in two collections correctly when they were
arranged in some ways but not in others.
For example, even the young child will say
that two collections contain the same number
of objects when the respective elements
are physically paired in a one-to-one fashion.
If, however, the objects in one collection
are spread out so that they appear to
cover a greater area, the nonconserving
child typically will say that this collection
contains more objects. Piaget has maintained
that conservation of number develops
gradually and tends to appear spontaneously
with time, due to a wide range of
experiences, and is not subject to specialized
training.

During the past few years, any number
of investigators have attempted to prove
Piaget wrong. In a recent study, for example,
Wallach and Sprott (1964) were apparently
able to make conservers out of nonconservers
during the course of a short-term
experiment. They trained subjects, through 
standard reinforcement procedures, to re- 
spond correctly whenever one of the collec- 
tions was transformed in one of several 
ways. The transformations included rear- 
ranging the objects, removing objects, and/ 
or adding objects. The experimenters were, 
indeed, successful in training their subjects 
on this task. Furthermore, the subjects were 
able to transfer their new-found ability to 
other conservation tasks in which the mate- 
rials were changed. 

Do the data of this experiment contradict 
Piaget’s contention that conservation of 
number is not subject to (short-term) ex- 
perimental intervention? On the surface, 
this would appear to be the case. A more 
detailed look at the experimental task, how- 
ever, indicates that the experimenters might 
not have included a crucial test of conser- 
vation. In particular, number conservation 
corresponds precisely to the mathematical 
notion of cardinal number; each cardinal 
number may be defined as the class of all 
sets which can be put into one-to-one corre- 
spondence with one another. A subject who 
has learned to conserve number has the 
ability to determine whether or not it is
possible to pair in a one-to-one fashion the 
objects in two (or more) arbitrary collec- 
tions. 

The subjects in the Wallach and Sprott 
(1964) study easily could have learned to 
give the correct responses without this sort
of determination. That is, it is quite possi- 
ble that they could have learned a simple 
rule, which, nonetheless, still made it possi- 
ble to respond correctly to the tasks pre- 
sented. All they had to do was remember 
the previous response (yes or no) and ob- 
serve the sort of transformation the experi- 
menter made. If the previous response was 
yes and a rearrangement was performed, 
the appropriate response would still be yes 
—the same would be true for no. If the 
experimenter added to or subtracted from 
one of the collections, the appropriate re- 
sponse could be determined with equal ease. 
In effect, the subjects could have responded 
to the conservation tasks they were pre- 
sented with in a way similar to the “six 


JOSEPH M. SCANDURA 


boxes” example above. In both cases, mak- 
ing a correct response would depend on re-
membering the previously correct response.

Contrast these tasks with one the investi- 
gators did not consider—that of adding to 
or subtracting from one of the collections 
without the subject’s knowledge, together 
with rearrangement where necessary to 
avoid correct responding on the basis of di- 
rect perception. Unlike the experimental 
tasks, a subject would be unable to give the 
correct responses without going through the 
process of one-to-one pairing or some 
equivalent, such as counting. 
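
The contrast between the two kinds of task can be sketched in code. This is a simplified illustration under stated assumptions: the function names are hypothetical, the rows start out equal, and the shortcut rule covers only the cases needed here. It is not a model of the Wallach and Sprott procedure.

```python
def same_number_by_pairing(row_a, row_b):
    """Pair objects off one at a time; 'yes' only if both rows run out together."""
    a, b = list(row_a), list(row_b)
    while a and b:
        a.pop()
        b.pop()
    return "yes" if not a and not b else "no"


def shortcut_rule(previous_answer, observed_change):
    """Answer from memory of the last response plus the transformation that was seen."""
    if observed_change == "rearrange":
        return previous_answer  # a rearrangement leaves the remembered answer unchanged
    if observed_change in ("add", "remove") and previous_answer == "yes":
        return "no"  # equal rows become unequal
    raise ValueError("case not covered in this simplified sketch")


if __name__ == "__main__":
    row_a = ["o"] * 6
    row_b = ["o"] * 6

    # Overt removal: the subject watches one object being taken away.
    after_overt = row_b[:-1]
    print(shortcut_rule("yes", "remove"), same_number_by_pairing(row_a, after_overt))

    # Hidden removal: an object is taken away unseen, so the subject
    # observes only a rearrangement.
    after_hidden = row_b[:-1]
    print(shortcut_rule("yes", "rearrange"), same_number_by_pairing(row_a, after_hidden))
```

On the overt task both procedures give the same answer, so correct responding cannot show which rule was learned. On the hidden-change task only pairing (or an equivalent such as counting) yields the correct response.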


CONCLUSIONS 


Definition of rule-governed behavior as a
function includes rote (association) and
conceptual behavior as special cases. In
analyzing more complex forms of behavior
it is argued that inputs frequently involve
internal and/or suppressed external stimu-
lation in addition to the nominal stimulus
itself. In particular, where the nominal
(overt) stimuli are identical, definition of
rule-governed behavior as a function
(where there is a unique response corre-
sponding to each stimulus) frequently
makes it possible to identify internal and/
or suppressed stimulation which enters into
the responses. This is accomplished by ask-
ing what, in addition to the specified overt
stimulus, must be remembered and/or iden-
tified in order that there be (at least one)
distinct effective stimulus for each (effec-
tive) response. Identifying effective stimuli
(i.e., properties of nominal stimuli, possibly
together with properties of internal and/or
suppressed stimulation) provides a basis for
more complete “stimulus control” and
makes it possible for an experimenter to
partial out the effects of memory (where
internal stimulation is involved) and/or
suppressed information (where information
is critical in determining responses but is
not identified as part of the [nominal] stim-
ulus).

Distinguishing between rule-governed be-
havior and rules as constructs may also be
critical in formulating and interpreting re-
search because more than one rule (in fact,
a denumerably infinite number) may un-
derlie the same rule-governed class. (In
most research on rule learning it is implic-
itly assumed that only one rule is in-
volved.) Explicit identification of the var-
ious rules that could reasonably be expected
to account for the rule-governed behavior
can provide clues to the experimenter con-
cerning how to distinguish among the rules
(i.e., how to determine what (rule) is
learned; cf. Scandura, 1970, 1971a).


CONDITIONS UNDER WHICH FEEDBACK FACILITATES
LEARNING FROM PROGRAMMED LESSONS 


RICHARD C. ANDERSON, RAYMOND W. KULHAVY, AND THOMAS ANDRE


University of Illinois 


One hundred and nineteen subjects completed a programmed intro-
duction to population genetics on PLATO, a computer-based educa-
tion system. On the criterion test which followed, performance was sig-
nificantly better when feedback had been provided after, rather than
before, the response.


Study after study has seemed to show 
that self-instructional programs teach as
well when immediate feedback is omitted.
One explanation is that the procedures em-
ployed in the previous studies have often
permitted students to copy the answers into 
the blanks without reading the material in 
the frames. Krumboltz and Weisman 
(1962) mimeographed their program on 
lightweight paper through which the feed- 
back on the next page could be read. At 
least two of the previous studies employed 
programmed texts in which the frames were 
printed in consecutive order down sheets of 
paper (Lublin, 1965; Rosenstock, Moore, & 
Smith, 1965). The correct answers appeared 
immediately below the frames. This ar- 
rangement made accidental exposure of the 
answers difficult to avoid, and of course 
made deliberate cheating easy. 

Anderson, Kulhavy, and Andre (1971) 
completed two experiments using a comput- 
er-controlled system which insured that 
feedback was not available until after the 
student responded. In each case, students 


¹ A version of this paper was read at the
annual meeting of the American Educational 
Research Association, New York, February 1971. 
The research reported herein was supported in 
part by the Advanced Research Projects Agency 
through the Office of Naval Research under Con- 
tract ONR Nonr 3985(08). The authors are grate- 
ful for the assistance of Bobbie Schmidt. 

² Requests for reprints should be sent to Rich-
ard C. Anderson, Training Research Laboratory,
296 Education Building, University of Illinois, 
Urbana, Illinois 61801. 

³ Now at Arizona State University.


who completed a 100-frame program on the 
diagnosis of myocardial infarction from 
electrocardiograms learned significantly 
more when feedback was provided than 
when it was not. For one group in the sec-
ond experiment, feedback was presented in
the lower right corner of the frame at the 
same time the rest of the frame was ex- 
posed. This group learned substantially less 
than the 100% feedback group, indeed even 
less than the 0% feedback group, despite 
the fact that the instructions repeatedly 
stressed that the student should enter his 
answer before he looked at the correct an-
swer.

Several studies using teaching machines 
have failed to show a significant advantage 
for feedback, even though the machine pre- 
sumably prevented the student from seeing 
feedback before he responded (e.g., Moore 
& Smith, 1964). A reasonable hypothesis is
that feedback failed to show to advantage 
because the programs used in these studies 
contained many copying frames and were 
otherwise heavily prompted. Insuring that 
the student responds before feedback is pro-
vided will prevent him from turning a pro-
gram into a series of copying frames, but it
stands to reason that such control will not 
help if the program already consists largely 
of copying frames. The chief purpose of the 
present research was to test this hypothesis. 


METHOD 


Subjects 


The subjects were 119 summer students en-
rolled in an educational psychology course. The
majority were volunteers; the remainder partici-
pated to fulfill a course requirement. All of the
potential subjects were given a verbal aptitude
measure, the Educational Testing Service Wide
Range Vocabulary Test, and a five-item test on
the arithmetic of proportions. The subjects were
randomly assigned to conditions from within three
verbal aptitude strata.
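
Random assignment from within aptitude strata can be sketched as follows. The report does not say how the strata were cut or how subjects were allotted within them. The equal-sized strata by rank, the shuffle-then-deal allotment, the function name, and the made-up scores are therefore all assumptions of this illustration.

```python
import random
from collections import Counter

CONDITIONS = [
    "0% standard", "100% standard", "peek standard",
    "0% copying", "100% copying",
]


def assign_within_strata(aptitude, n_strata=3, seed=0):
    """Map each subject to a condition, randomizing separately inside each stratum.

    aptitude: dict of subject id -> verbal aptitude score (hypothetical data).
    """
    rng = random.Random(seed)
    ranked = sorted(aptitude, key=aptitude.get)   # lowest to highest score
    size = -(-len(ranked) // n_strata)            # ceiling division
    assignment = {}
    for start in range(0, len(ranked), size):
        stratum = ranked[start:start + size]
        rng.shuffle(stratum)                      # random order within the stratum
        for position, subject in enumerate(stratum):
            # Deal the shuffled subjects out to the conditions in turn.
            assignment[subject] = CONDITIONS[position % len(CONDITIONS)]
    return assignment


if __name__ == "__main__":
    score_rng = random.Random(1)
    scores = {f"S{i:03d}": score_rng.randint(10, 50) for i in range(119)}
    groups = assign_within_strata(scores)
    print(Counter(groups.values()))
```

Dealing out shuffled subjects stratum by stratum keeps the conditions roughly balanced on verbal aptitude, which is the usual reason for stratifying before randomizing.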


Material 


The standard program consisted of the first 104 
frames adapted from a programmed introduction 
to population genetics (Faust, Anderson, Guthrie, 
& Drantz, 1969). The subject matter was un- 
familiar to most of the potential subjects, and the 
program is known to teach effectively. The copy- 
ing program was identical to the standard program 
except that almost every frame was turned into a 
copying frame. 

The criterion test consisted of two short-answer 
items for which a total of 13 points were awarded 
and 81 four-alternative multiple-choice questions. 
Thirty-five of the points were awarded for ques- 
tions that required the students to apply a con- 
cept or principle to an example different from any 
contained in the program, or entailed identifi- 
cation of a paraphrased concept or principle. 


Design and Procedure 


Three groups received the standard program, 
one without feedback (0% standard), one with 
feedback after every frame (100% standard), and
one with feedback continuously in view (peek
standard). The instructions for the latter group
stated three times that the subject should compose
his own answer before he looked at the feedback.
Two groups received the copying program. The 0%
copying group got no feedback whereas the 100%
copying group received feedback following the
response to every frame. The feedback appeared
beneath a row of Xs, immediately below the spot
where the response was typed.

The experiment was conducted on PLATO, a
computer-based instructional system located at
the University of Illinois, which consists of a Con-
trol Data Corporation 1604 computer connected
to student stations housed in semi-enclosed
booths. Each station contains a 12-inch-square
screen upon which stimuli are displayed
and a keyboard upon which responses
are typed.

The subjects were run in groups of 14-18. Of
course, PLATO allowed individualized treatment
of each subject. All experimental groups were
represented in each session. PLATO presented the
instructions appropriate for each condition, pre-
sented the appropriate program, controlled the
feedback contingencies, presented a questionnaire
following the program, recorded errors during the
program, and recorded time on each frame. The
criterion test was administered in paper-and-
pencil form in a nearby room.


RESULTS AND DISCUSSION


Table 1 contains mean correct on the cri- 
terion test for each group. An overall analy- 
sis of variance indicated no significant dif-
ferences among groups (F = 1.29, df = 4/
114), and only one of three preplanned
comparisons was significant. An analysis of 
covariance, with the verbal aptitude and 
arithmetic scores as covariates, showed ex- 
actly the same picture. 
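
The degrees of freedom reported for the overall analysis follow from the design, assuming a one-way analysis of variance over the five groups and all 119 subjects. The short sketch below makes the arithmetic explicit; the function name is illustrative.

```python
def one_way_anova_df(n_subjects, n_groups):
    """Return (between-groups df, within-groups df) for a one-way analysis of variance."""
    return n_groups - 1, n_subjects - n_groups


if __name__ == "__main__":
    between, within = one_way_anova_df(n_subjects=119, n_groups=5)
    print(f"df = {between}/{within}")  # df = 4/114
```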

As expected, the 100% standard group 
performed significantly better on the crite- 
rion test than the peek standard group (t = 
1.97, df = 46, p < .05). This confirms pre- 
vious research which used a different lesson 
(Anderson et al., 1971). Thus, the data
warrant a measure of confidence in the ge- 
neralizability of the proposition that feed- 
back in programmed lessons facilitates 
learning if available after, but not before, 
the response. 

We had also expected on the basis of our 
previous research to find that feedback 
after every response was better than no 
feedback, since the PLATO system pre- 
cluded accidental or deliberate exposure of 
feedback before a response. The 100% 
standard group did perform somewhat bet- 
ter on the criterion test than the 0% stand- 
ard group, but the difference was not signif-
icant (t = 1.45, df = 46, .10 > p > .05).

Anderson, Faust, and Roderick (1968) 
showed that effectiveness is undermined 
when a program is altered so that all of the 
frames are copying frames or otherwise 
heavily prompted. Consequently, it was 
predicted that the 100% standard group 
would perform better on the test than the 
100% copying group. The trend of the data 
was in this direction, but, once again, the 


difference was not significant (t = 1.34, df
= 47, .10 > p > .05).


TABLE 1
MEAN CORRECT ON THE CRITERION TEST AND
MEAN PROGRAM ERRORS

                                     Group
                  100%      0%        Peek      100%     0%
Measure           standard  standard  standard  copying  copying
Criterion test    27.0      23.8      22.6      24.0     22.7
Program errors    24.6      27.3      15.6      21.3     17.2


As Table 1 shows, there were significant 
differences among groups in the frequency 
of errors during the program (F = 3.18, df 
= 4/114, p < .05). Errors were much less 
frequent in the peek standard group than in 
the other groups that received the standard 
program, which confirms our previous find- 
ings (Anderson et al., 1971). We argue, as 
we previously did, that this must be be- 
cause subjects indeed peeked at the correct 
answer, sometimes copying the answer 
without studying carefully the material in 
the frame. In support of this interpretation 
were the comments on an open-ended post- 
experiment questionnaire of many of the 
subjects who completed the peek program. 
One wrote, “My temptation to look at the 
answer before working the problem was 
great.” Another said, “It is very stupid to 
show the correct answer before the answer 
is solicited. The statement of the answers 
makes all of the presented questions extra-
neous material.”


SOME EFFECTS OF REINFORCEMENT ON ACHIEVEMENT
AND BEHAVIOR IN A REGULAR CLASSROOM’ 


No. 3, 189-193 


GEORGE WALKER ROSENFELD* 
University of Minnesota 


Sixty sixth-grade students were reinforced for passing tests on their 
regular arithmetic curriculum. The number of tests passed by these 
60 students under regular classroom reinforcement was compared to 
the number passed under chart reinforcement, monetary reinforce-
ment, and monetary plus chart reinforcement. There was a significant
improvement for the total class and for the middle-IQ students dur- 
ing monetary plus chart reinforcement. High-IQ students improved 
under monetary and monetary plus chart reinforcement. Low-IQ 
students showed no improvement. Most students spontaneously com- 
peted with someone of equal competence. Irrespective of success, 
students positively evaluated their year’s progress. It is concluded that 
(a) the addition of reinforcements to a regular classroom curriculum 
resulted in improved performance for many students, and (b) im-
provement was positively related to IQ.


Studies of the application of reinforce- 
ment techniques to improve academic per- 
formance support the commonsense view 
that incentives improve learning or per- 
formance (Staats & Butterfield, 1965; 
Tyler & Brown, 1968; Wolf, Giles, & Hall, 
1968). Laboratory studies with children,
however, indicate that the addition of ex-
trinsic reinforcement may at times result in
a decrease in performance (Cantor & Hot-
tle, 1955; Miller & Estes, 1961).

The present investigation extends such
research by determining how the introduction
of several extrinsic reinforcers into a nor-
mal classroom would affect performance.
The importance of intelligence as a media-
tor of the effectiveness of reinforcement was
also examined.

Many teachers believe that high-IQ stu-
dents work up to capacity, while
low-IQ students could improve their
performance if they would apply them-
selves. By analyzing the effect of reinforce-
ment on the performance of students of dif- 
ferent intellectual levels, an operational 
definition of “working at capacity” was de- 
rived and an empirical test of the teacher's 
belief was undertaken. 

Another purpose of this study was to ex- 
amine the effect of reinforcement and 
achievement on self-evaluation, especially 
in a classroom in which achievement was 
clearly, objectively, and often publicly 
measured. Common sense predicts that 
high-achieving students would see them-
selves as more successful than low achiev-
ers.


METHOD 


Subjects 

The subjects were the 60 sixth-grade students 
attending an all-white middle-class Catholic school 
in Minneapolis. Their Lorge-Thorndike IQs ranged
from 84 to 135 with a mean of 108. They had math
class in two groups of 30. Students were assigned
to these 40-minute morning classes according to
their reading achievement scores. The first class
contained the top and bottom reading groups, 
while the second class contained the middle groups. 


Tests 


A widely used sixth-grade arithmetic book 
(Seeing Through Mathematics, Chicago: Scott
Foresman, 1969) was divided into 51 sections
which the experienced teacher felt were of equal 
difficulty and would take most students 20 minutes 
to read and understand. Three alternate forms of 
equal difficulty of a 25-minute test were con- 
structed for each section. 

The first 3 weeks of the school year were spent 
reviewing the previous year's work. Students were 
required to pass a review test. Once passed, the 
student could begin working on the sections of the 
sixth-grade book and proceed at his own pace. 
Some students passed the review test immediately 
and others took longer. By the end of the 3 weeks, 
all students were working on the sections in the 
book. The fourth week of school was the first week
of the “regular classroom reinforcement condition.”

Students were asked to learn the material in the 
section to which they had advanced. After learning 
the material, they were required to pass a test in 
class covering the section. To pass the test the
child was required to demonstrate approximately
85% competence on each section of the test. If
they passed a test, they could go on to the next
section. If they received a “take-over,” they were
required to wait until the next day and then take
another form of the same test. After taking all 
three alternate forms, they were required to start 
over again until they passed the test. Students 
were encouraged to solve problems on their own 
and to get help from each other or the teacher 
when necessary. Since students were working on 
different tests and different levels, most help was 
given on an individual or small group basis. Most 
help was given to the lower-IQ students. 


Reinforcement 


Four reinforcement conditions were used: regu-
lar classroom reinforcement, monetary reinforce-
ment, chart reinforcement, and monetary plus
chart reinforcement. The first 5 weeks following the
review period were regular classroom reinforcement
weeks. A week was 5 consecutive days in which
math class was held. At the start of each class, the 
students were told which type of week it was and 
the number of days left in the week. During the 
regular classroom reinforcement period, besides 
peer responsiveness, the students received no addi- 
tional reinforcement other than passing a test and 
the encouragement and social approval of their 
teacher, which was the same throughout all condi- 
tions. 

The following 3 weeks were for monetary rein-
forcement. The students were told at the start of
the week that for passing one test during the week
they would receive 5¢. Two tests were worth 15¢;
three tests were worth 30¢; four tests, 50¢; five
tests, 75¢; six tests, $1.05; seven tests, $1.40; eight
tests, $1.80, etc. They were paid at the end of each
week. Approximately $125 was distributed over 29
weeks.
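
The listed payments rise by 10¢, 15¢, 20¢, and so on, which is the pattern of 5¢ times the triangular numbers. The sketch below encodes that inferred formula and checks it against the eight amounts given. The closed form, the function name, and the continuation beyond eight tests (the “etc.”) are assumptions rather than anything stated in the report.

```python
def weekly_payment_cents(tests_passed: int) -> int:
    """Payment in cents for a week: 5 cents times the n-th triangular number."""
    if tests_passed < 0:
        raise ValueError("tests_passed must be non-negative")
    return 5 * tests_passed * (tests_passed + 1) // 2


# Amounts stated in the text, in cents, keyed by number of tests passed.
REPORTED = {1: 5, 2: 15, 3: 30, 4: 50, 5: 75, 6: 105, 7: 140, 8: 180}

if __name__ == "__main__":
    for n, cents in REPORTED.items():
        marginal = weekly_payment_cents(n) - weekly_payment_cents(n - 1)
        print(f"{n} tests: {weekly_payment_cents(n)} cents "
              f"(reported {cents}, last test adds {marginal})")
```

Under this schedule each additional test in a week is worth 5¢ more than the one before it, so a student who already passes several tests a week gains more from one further test than a student who passes few.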

The next 2 weeks were for monetary plus chart
reinforcement. The children were shown a large
chart with spaces next to each name. They were
told that in addition to money, a star would be
placed next to their name when they passed a test
so they could keep track of their progress. Each
week a new chart was started.

The next 2 weeks were for chart reinforcement.
The children were told that money would be dis-
continued and that the chart would remain.

Throughout the year the order of these four 
conditions was varied. They were next presented in 
2-week segments in the reverse order. Then the
three extrinsic reinforcement conditions were pre-
sented in 2-week segments, each separated by 1
week of regular classroom reinforcement. Over the
entire school year, there were 10 weeks of regular 
classroom reinforcement, 7 weeks of monetary 
reinforcement, and 6 weeks each of chart, and 
monetary plus chart reinforcement. 


RESULTS


Achievement 


Table 1 shows the average number of 
tests passed during each week under the 
four reinforcement conditions. The table 
gives these figures for the 10 highest-IQ, the 
10 lowest-IQ, and the 10 middle-IQ stu- 
dents and for all 60 students. 

Matched-pair t tests were used to find
differences between regular classroom rein- 
forcement and each of the three other con- 
ditions (see Table 1). The total class im- 
proved their rate of progress only when 
both money and chart were present. Low- 
IQ students showed no differences over the 
four conditions. In comparison to regular 
classroom reinforcement, the middle-IQ 
students showed a significant increase under 
the money plus chart condition, and high- 


TABLE 1
AVERAGE NUMBER OF TESTS PASSED PER WEEK
UNDER FOUR REINFORCEMENT CONDITIONS FOR
THE HIGHEST, MIDDLE, AND LOWEST IQ
STUDENTS AND FOR THE TOTAL CLASS

                          Conditions of reinforcement
                     Regular                         Monetary
Students' IQ level   classroom  Monetary  Chart      plus chart
IQ > 120 (a)         1.9        2.4**     2.1        [illegible]
106 ≤ IQ < 110 (a)   1.35       1.29      1.48       [illegible]
IQ < 93 (a)          .82        .77       .75        [illegible]
Total class (b)      1.34       1.39      1.42       [illegible]

(a) n = 10.
(b) n = 60.
* p < .05.
** p < .01.


[Figure 1: line graph of average tests passed per week (vertical axis) across the successive reinforcement conditions and their durations in weeks (horizontal axis), with separate curves for the IQ groups.]

FIG. 1. Average number of tests passed per week during successive periods of regular
classroom reinforcement (RC), monetary reinforcement (M), monetary plus chart rein-
forcement (MC), and chart reinforcement (C) for high-IQ, middle-IQ, and low-IQ groups.


IQ students benefited from money plus
chart and from monetary reinforcement.
There was almost no overlap in mean
performance of the three IQ groups
throughout the year (see Figure 1). When
reinforced, however, middle-IQ students
often exceeded the regular classroom per-
formance of high-IQ students.
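
The matched-pair t tests used above compare each student's rate under regular classroom reinforcement with the same student's rate under an extrinsic condition. The following is a minimal sketch of the computation. The ten pairs of weekly rates are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(baseline, treatment):
    """Return (t, df) for a matched-pair t test on two equal-length score lists."""
    if len(baseline) != len(treatment):
        raise ValueError("each subject needs one score in each condition")
    diffs = [t - b for b, t in zip(baseline, treatment)]
    n = len(diffs)
    # t is the mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # Hypothetical tests-per-week rates for 10 students (not from the article).
    regular = [1.8, 2.0, 1.6, 2.2, 1.9, 2.1, 1.7, 2.0, 1.8, 1.9]
    money_plus_chart = [2.4, 2.2, 2.0, 2.6, 2.5, 2.3, 2.1, 2.6, 2.2, 2.4]
    t_stat, df = paired_t(regular, money_plus_chart)
    print(f"t = {t_stat:.2f}, df = {df}")
```

With 10 students in an IQ group the test has 9 degrees of freedom.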
When the number of tests passed during
the first 5 weeks of regular classroom rein-
forcement was compared to the number of
tests passed under regular classroom rein-
forcement at the end of the year, matched-
pair t tests showed no significant differences
at either high-, middle-, or low-IQ levels (ts
= [illegible], -.22, -.62, respectively). A
matched-pair t-test comparison was also
made of the number of tests passed during
the initial 2 weeks of each reinforcement con-
dition with the number passed during the
last 2 weeks of that type of reinforcement
for high-, middle-, and low-IQ students. There
were no differences except the high-IQ stu-
dents showed a significant improvement
during the monetary plus chart condition (t
= [illegible], df = 9, p < .05).


Competition


Questionnaires distributed several times
during the term revealed that of the 60 chil-
dren, 45 stated that they had chosen to
compete with another student in their class 
during the year. Thirty-five of the 45 com- 
peted for more than half the school year. 
There were 24 children who were part of 
reciprocal pairs that lasted for over half the 
term. Electing to compete was not related 
to the number of tests passed during the 
year. The number of competitions, the num- 
ber of reciprocal competitions, and the du- 
ration of the competitions were also inde- 
pendent of achievement level. 

Children tended to compete with a 
same-sex partner who had passed a similar 
number of tests. A Pearson product-moment 
coefficient of correlation (r = .89) was 
found between the achievement ranks of the 
self-selected competing pairs. 
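
The product-moment correlation between the ranks of competing partners can be computed as in the sketch below. The ranks shown are invented for illustration, so the printed coefficient is not the reported r = .89.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical achievement ranks: each position is one competing pair.
    student_rank = [3, 10, 17, 25, 31, 40, 46, 55]
    partner_rank = [5, 8, 20, 22, 35, 38, 50, 52]
    print(f"r = {pearson_r(student_rank, partner_rank):.2f}")
```

A coefficient near 1 indicates that students chose partners whose achievement rank was close to their own.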


Self-Evaluation 


Students were asked to evaluate their 
progress during the year and at the end of 
the year by answering the question, How 
did you do in math this year? The student 
circled one of four answers: very well, well, 
poorly, very poorly. The answers to this 
and several other questionnaire items 
showed little change over the year. Regard- 
less of the number of tests passed, the stu-
dents indicated that they did well or very
well. 


Discussion 


The total class showed improved per- 
formance over regular classroom reinforce- 
ment during the money plus chart condi- 
tion. Therefore, the use of these reinforce- 
ments without the introduction of a special 
curriculum can increase the rate of achieve-
ment of many students.

Level of intelligence was related posi- 
tively to the number of reinforcers that 
were effective, as well as to the amount of 
improvement under each reinforcement. In 
comparison to regular classroom reinforce- 
ment, the high-IQ group increased its al- 
ready superior rate by almost 50% under 
the combined reinforcement condition. 
Under the monetary and chart condition, 
the high-IQ students passed significantly 
more tests at the end of the year than at the 
beginning. Not unlike the results of a grad- 
ing system, the higher the IQ of a student, 
the better able was he to progress when of- 
fered extrinsic rewards. This is particularly 
interesting since offering extrinsic rewards 
other than grades for increased performance 
is usually reserved for the disturbed or dis- 
advantaged student and rarely is found in 
the normal classroom. 

There was no improvement in the low-IQ 
group under any of the reinforcement con- 
ditions or within reinforcement conditions. 
This lack of improvement may have oc- 
curred not because the reinforcements were 
not desired by these students, but because it 
would have required more effort for them to 
improve than for the brighter children. 
Clark, Lachowicz, and Wolf (1968) have 
shown that students with IQs similar to the 
low-IQ students in this study can better 
their performance under reinforcement; 
however, they were working with curricula 
that required less effort to improve than 
that used in this study. It is also possible 
that the low-IQ students did not improve 
because the reinforcement system may have 
favored the high-IQ students in that a typi- 
cal 50% increase in productivity for the 
high-IQ group might have been worth 15¢, 
whereas a typical 50% increase for a low-
IQ student may have been worth about 5¢.
Further experimentation is required to
identify what particular factors accounted
for the lack of improvement of these low-IQ
students.



Assume that a reinforcer is desired by a
group of children. If the addition of this 
reinforcer results in improved performance, 
then these children were not fully applying 
the skills they already possessed prior to 
being reinforced. The introduction of rein- 
forcement offers the teacher an operational 
definition of “working at capacity.” The in- 
ability to improve performance, given that 
the student is highly motivated to do so 
within a particular curriculum, is an indica- 
tion that the student was already working 
at capacity prior to reinforcement. Since 
the high- and middle-IQ groups improved 
under reinforcement, it would follow that 
previously they had not been working at 
capacity with the curriculum. This is con-
trary to a common assumption that it is the
low-IQ students who are loafing. 

By introducing highly desired reinforcers 
into a curriculum, two groups can be identi- 
fied. There is a group, not working at ca- 
pacity before reinforcement, that is capable 
of improving its performance without extra 
assistance from the teacher; and a group, 
working at capacity, that does need assist-
ance from the teacher or a change in curric-
ulum in order to improve performance. Fur-
ther experimentation is required to identify
what specific behaviors were being rein-
forced in the children who improved. It
would be very interesting to know if the
reinforcement affected such behaviors as
time spent studying, help seeking, and at-
tention and cognitive strategies during
preparation and test taking.

High-achieving students could not be dif-
ferentiated from students who passed few
tests during the year on the basis of their
questionnaire descriptions of their perform-
ance or attitude toward math. This could be
accounted for by the spontaneous competi-
tion which pervaded all levels of perform-
ance. Since students competed with others
who passed a similar number of tests, the
feelings of success reported by the low-a-
chieving students may have been due to
their evaluating their performance in rela-
tion to their competitor, rather than to the
class as a whole. Competition for the low-a-
chieving student may, therefore, not always
lead to low self-esteem and problems but 
may lead to an experience of success. 

Anderson (1967) attributed the reluc- 
tance of educators to employ tangible rein- 
forcement to the belief that students will 
become dependent upon it and will not per- 
form without it. This belief was not sup- 
ported, since students’ performance during 
regular classroom reinforcement was un- 
changed from before to after being exposed 
to the extrinsic reinforcers, and since there 
was not a decline in performance within the 
less powerful reinforcement conditions over 
the year. This study demonstrates that 
without changing the normal curriculum, 
the introduction of reinforcement can result 
in improved performance for the average 
and the brighter students, without causing 
nonreinforced or less highly reinforced be- 
havior to suffer. Since reinforcement alone 
can help these students to improve their 
performance, the teacher can be freed to 
work more intensively with those students 
who need assistance.


EFFECTS OF EXPERT ENDORSEMENT OF BELIEFS ON
PROBLEM-SOLVING BEHAVIOR OF HIGH AND
LOW DOGMATICS¹


CHARLES B. SCHULTZ AND FRANCIS J. DI VESTA²
Pennsylvania State University 


The effect of authority endorsement of hints on the ability of high and 
low dogmatics (n = 90) to solve a pencil-and-paper version of the 
Denny Doodlebug problem was examined. At five points during the 
problem-solving period, the subjects were presented two alternatives, 
one of which was appropriate for the problem solution (new beliefs) 
and one of which was inappropriate (old beliefs). In one condition, 
the new-belief alternative was reputed to be endorsed by the majority 
of experts on problem solving, while in a second condition, the old- 
belief alternative was said to be endorsed. In the control group, 
neither alternative received endorsement. The effect of old-belief en- 
dorsement was to facilitate analysis and synthesis for low dogmatics 
and to reduce their “errors” relative to high dogmatics. New belief 
endorsement had the opposite effect. The findings were interpreted in 
terms of the high and low dogmatics’ relation to authority and to task
requirements.


Rokeach (1960) analyzed problem-solv- 
ing behavior into two phases: analysis of 
the problem into relevant parts or “beliefs” 
and synthesis or integration of these beliefs 
in unique ways or systems in order to arrive 
at a solution. During the analysis stage, the 
problem solver must typically reject his ex- 
isting inappropriate beliefs in favor of new, 
more appropriate beliefs. Influences such as 
those associated with set (Luchins, 1942) 
and functional fixedness (Duncker, 1945) 
are especially detrimental in the analysis 
stage. During the synthesis stage, the prob- 
lem solver must take these new beliefs and 
transform, integrate, or otherwise organize 
them into a novel or unique system. Among 
the variables that affect synthesis are those 
cognitive, motivational, or personality characteristics
of the problem solver which
restrain or limit integration, an example of
which is dogmatism.


¹ The research reported in this paper was sup-
ported in part by the Advanced Research Projects
Agency (ARPA Order No. 1269) through the
United States Office of Naval Research under Con-
tract ONR Nonr N00014-67-A-0385-0006.

² Requests for reprints should be sent to Francis
J. DiVesta, Department of Educational Psychol-
ogy, Pennsylvania State University, 201 Social Sci-
ence Building, University Park, Pennsylvania
16802.

Persons characterized by a closed-belief 
system (i.e., high dogmatics) presumably
are able to discard individual existing be-
liefs for new beliefs. However, they fail to
integrate new beliefs into their system, es-
pecially if the new beliefs differ radically
from their existing belief system (Fillen-
baum & Jackman, 1961; Rokeach, 1960).
In contrast, persons characterized by an
open-belief system (i.e., low dogmatics) ex-
perience relatively little difficulty during
synthesis. Furthermore, the ability of high
dogmatics to evaluate information inde-
pendent of its source is limited severely by
their absolute authority beliefs. As a result,
high dogmatics uncritically depend upon
authority. Low dogmatics, on the other
hand, tend to evaluate information on the
basis of its objective validity as well as the
reliability of the source. They tend not to
be influenced by an authority merely be-
cause of the power it represents (Ehrlich &
Lee, 1969; Restle, Andrews, & Rokeach,
1964; Rokeach, 1960).


On at least one occasion, Rokeach (1960) 
reported an exception to the foregoing dis-
tinctions between high and low dogmatics
in which high dogmatics synthesized rele- 
vant beliefs and reached a problem solution 
more quickly than did low dogmatics. In 
this instance, subjects were presented all 
the new beliefs required to solve a problem 
(i.e., the “silver-platter” condition), thereby
eliminating the necessity for analysis. Thus, 
the presentation of new beliefs removed the 
“main psychological obstacle” in the prob- 
lem solving of high dogmatics because they 
need not compromise their older everyday 
beliefs by acceptance of new beliefs from a 
presumed expert (i.e., from the experimenter).
On the other hand, low dogmatics
tend to delay acceptance of new beliefs 
presented in this fashion until they have 
been judged as appropriate for the task. 
Within this framework, high dogmatics are 
viewed as “blindly” accepting new beliefs 
attributed to an authority, while low dog- 
matics resist having authority endorsements 
“rammed down their throat.” One implica- 
tion that may be drawn from the above 
argument is that both high and low dog- 
matics are authority oriented; the former 
tend to accept authority while the latter 
initially reject it. 

Based on this rationale, it was hypothe-
sized that the predisposition of high dog-
matics to uncritically accept authority ad-
vice facilitates their problem solving when
new beliefs receive authority endorsement
and hinders problem solving when old be-
liefs are endorsed. Conversely, it was as-
sumed that expert advice would be resisted,
at least initially, by low dogmatics. Addi-
tionally, it was expected that low dogmatics
turn to task-relevant alternatives when re-
ceiving endorsements of old beliefs, thereby
facilitating their performance. When receiv-
ing endorsements of new beliefs from an au-
thority, low dogmatics consider these in re-
lationship to other alternatives, thereby
impeding immediate acceptance of an oth-
erwise valid piece of information. As a con-
sequence, their performance would be hin-
dered. These hypotheses imply that levels of
dogmatism interact with type of expert en-
dorsement to affect performance.
METHOD
Design 

The subjects were administered a series of op- 
portunities to solve the Denny Doodlebug problem 
(Rokeach, 1960) by recording their answers on a 
paper-and-pencil version of the problem situation. 
Performance on these trials was used to compute 
separate scores for analysis and synthesis. A hint, 
comprised of two alternative strategies, was pro- 
vided on each trial: an old belief and a new belief. 
The old belief was comprised of a commonly ac- 
cepted though incorrect or misleading approach to 
the problem and thereby functioned to increase 
“set.” The new belief was always a correct piece of 
information and, under typical problem-solving 
conditions, must eventually be discovered if the 
subject is to arrive at the solution. 

Subjects in one treatment were informed that a 
majority of experts endorsed the new beliefs. In a 
second treatment, the subjects were informed 
that a majority of experts endorsed the old beliefs.
In the control condition, both alternatives were 
presented, but neither statement was said to be 
endorsed by experts. These three conditions were 
orthogonally crossed with two levels of dogmatism: 
high scorers represented closed-mindedness and 
low scorers represented open-mindedness. Thus, a 
2 × 3 factorial analysis of variance was implied.


Subjects 


The subjects were 105 undergraduate volunteers 
from an introductory course in educational psy- 
chology at Pennsylvania State University. They 
received credit toward their course grade for par- 
ticipating in the experiment. These subjects were 
drawn from a larger pool of undergraduates who 
had taken a battery of tests, which included Form 
E of the Dogmatism scale (Rokeach, 1960), 4 
weeks before the experimental sessions. Each sub-
ject was randomly assigned to one of the two experimen-
tal conditions or to the control condition. In each
condition, only the 15 high and 15
low scorers on the Dogmatism scale were used in 
the analyses (n = 90). 
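
The design and cell sizes described above determine the degrees of freedom reported in the Results. The following is a minimal Python sketch of that bookkeeping, not part of the original report; the names `DOGMATISM`, `ENDORSEMENT`, and `anova_df` are illustrative, and a balanced between-subjects design with 15 subjects per cell is assumed.

```python
from itertools import product

# Factor levels as described in the Design and Subjects sections.
DOGMATISM = ("low", "high")
ENDORSEMENT = ("old belief", "control", "new belief")


def anova_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a balanced two-way between-subjects ANOVA."""
    a, b = len(levels_a), len(levels_b)
    total_n = a * b * n_per_cell
    return {
        "cells": a * b,
        "N": total_n,
        "A": a - 1,                # main effect of the first factor
        "B": b - 1,                # main effect of the second factor
        "AxB": (a - 1) * (b - 1),  # interaction
        "error": total_n - a * b,  # within-cell (error) term
    }


cells = list(product(DOGMATISM, ENDORSEMENT))
df = anova_df(DOGMATISM, ENDORSEMENT, n_per_cell=15)
print(len(cells), df)
```

Under these assumptions the six cells give N = 90 and an error term of 84 degrees of freedom, which agrees with the df of 1/84 and 2/84 quoted for the main analyses.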


Task and Apparatus 


The task employed was the Denny Doodlebug 
problem (Rokeach, 1960). Because the subjects 
were not permitted to question or talk with the 
experimenter during the experiment, several sen- 
tences were underlined and several were added to 
the original statement in an attempt to eliminate 
questions about procedure. Each subject worked 
in an isolated booth equipped with a 2 × 2 inch
square of plain cardboard which served as the 
problem board. A hard rubber piece representing 
the “doodlebug” and another representing the goal 
or food were placed on the board in a position that 
corresponded to the problem situation. The north, 
south, east, and west points were labeled on each 
side of the board to provide additional orientation. 
The directions to the problem, hints, and time
signals employed in the experimental procedures 
were transmitted from a tape recorder simultane- 
ously to headsets in each of six booths. 


Procedure 


Upon entering the experimental booth, the sub- 
jects received tape-recorded instructions that in- 
cluded a brief cover story in addition to the prob- 
lem statement. 

The problem-solving periods. The 28 minutes al- 
lowed to solve the problem were divided into seven 
segments. Each of the first six segments was 3
minutes long. However, the subjects were allowed 
up to 10 minutes to solve the problem during the 
final (seventh) segment. Once every minute, the 
subjects were signaled to record their best answer 
to the problem on a response recording form. Each 
signal represented one trial. Since each segment 
began with a signal (i.e., with a trial), the subjects 
attempted four solutions (trials) during each 3- 
minute period, thereby resulting in a total of 24 
problem-solving trials. There were no interrup- 
tions by signals during the last 10-minute period. 
Whenever the subject had arrived at what he 
thought was a correct answer, he demonstrated his 
solution for the experimenter. If correct, the ex- 
perimenter verified it and recorded the time taken 
to arrive at the solution. If incorrect, the subject 
was requested to continue working on the problem. 
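
The timing arithmetic in the preceding paragraph can be checked with a short Python sketch. It is offered only as an illustration; the constant and function names are hypothetical, and the figures are those stated in the text (six signaled 3-minute segments with four trials each, followed by one 10-minute segment without signals).

```python
# Figures taken from the Procedure section; names are illustrative only.
SIGNALED_SEGMENTS = 6       # segments in which trial signals were given
SEGMENT_MINUTES = 3         # length of each signaled segment
TRIALS_PER_SEGMENT = 4      # one signal at the start, then one per minute
FINAL_SEGMENT_MINUTES = 10  # seventh segment, no signals


def session_summary():
    """Return (total minutes allowed, total signaled trials)."""
    total_minutes = SIGNALED_SEGMENTS * SEGMENT_MINUTES + FINAL_SEGMENT_MINUTES
    total_trials = SIGNALED_SEGMENTS * TRIALS_PER_SEGMENT
    return total_minutes, total_trials


print(session_summary())
```

The result, 28 minutes and 24 trials, matches the totals given in the text.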

Presentation of hints. At the beginning of each 
of the problem-solving periods (2 through 6), the 
subject was given problem-relevant information re- 
garding one of the beliefs necessary to solve the 
problem. The information was in the form of al- 
ternatives, each of which was, in actuality, either 
a new or an old belief, as shown below: 


Period 2: 
(a) Joe could only move north because he 
is facing in that direction. (Old belief) 
(b) Joe can jump sidewards and backwards 
as well as forwards. (New belief) 
Period 3: 
(a) Joe was moving away from the food 
when it was placed on the board. (New belief) 
(b) Joe was moving toward it. (Old belief) 
Period 4:
(a) Joe is in the midst of a sequence of four
jumps. (New belief)
(b) Joe is at the end of a sequence. (Old belief)
Period 5:
(a) Joe could have taken one jump in the
sequence of four when the food was placed on
the board. (New belief)
(b) Joe could have been on his third jump.
(Old belief)
Period 6:
(a) Joe's final jump could have been longer
than his previous ones. (New belief)
(b) Joe's final jump could be shorter than
his previous ones. (Old belief)


Since the subjects had little difficulty with the 
“facing belief” in preliminary work with this version
of the Denny Doodlebug problem, it was not in-
cluded as one of the hints. However, the “direction
belief” (i.e., the bug can move sidewards and back-
wards) and one version of the “movement belief”
(i.e., the bug was moving east when the food was
placed on the board) were retained in the first and
second hints, respectively (Rokeach, 1960, pp. 188,
200). A second version of the movement belief (i.e.,
the bug was in the midst of a four-jump sequence
when he saw the food) was employed as the third 
hint (Rokeach, 1960, pp. 173, 232). The fourth and 
fifth hints were designed specifically to direct au- 
thority-oriented subjects by expert endorsement 
of new beliefs and to misdirect them by expert en- 
dorsement of old beliefs. 

The alternatives and endorsements were pre- 
sented as part of tape-recorded instructions. They 
were also printed on information cards that the 
subject read while listening to the tape. These 
cards as well as the printed directions to the 
problem were available to the subject throughout 
the problem.

Treatments. In all conditions, both alternatives 
in each set of hints were presented. The experi- 
mental manipulations consisted of the endorsement 
of one of the alternatives by the majority and the 
other alternative by the minority of “psychological 
experts.” For example, in the old-belief condition, 
the subjects were told for the first hint (Period 2): 


About three-fourths of the experts who have 
studied this problem feel that if the problem 
solver assumes that Joe can only move forward
because he is facing in that direction, it helps 
solve the problem. The remainder feel that the 
assumption that Joe can jump sidewards or back- 
wards helps in the solution. 


In the new-belief experimental condition, the en- 
dorsements of the new belief and old belief were 
reversed. The control condition was based on the 
same sets of alternatives but without endorse-
ments.

The problem was considered solved when the
subject recorded the correct answer and requested
verification from the experimenter. At the end of
the problem-solving period, the correct solution
was given to those who did not find it.


Scoring Procedure 


Several scores were computed from the data
obtained on each trial: evidence of disoriented re-
sponses (violations of the problem directions or
general instructions), evidence of adherence to old
beliefs, evidence of the acceptance and use of new
beliefs (analysis score), and evidence of the inte-
gration of two or more new beliefs (synthesis
score). The procedure for scoring can be illustrated
by the example in Figure 1, which represents the
problem-solving period for the belief that the
doodlebug should first jump away from the food.
In his first and third trials, the subject followed
the old belief from the alternatives presented in
this period. His second attempt was a disoriented
response. The fourth trial represents a new belief,
although the solution attempt was incorrect. There
is also evidence of synthesis in this illus-
tration. (Such integration may be of two, three, or
four beliefs. The fifth belief is included in the
integration of the first four.) Thus, the third trial
in Figure 1 represents the integration of the
beliefs that the doodlebug was moving sideways
and that he was in the middle of a sequence when
the food was placed on the board. The fourth trial
represents the integration of these two new beliefs
and the additional belief that the bug must get to
the food by starting away from it. In the actual
scoring of these responses, the number of instances
in which two, three, or four beliefs were integrated
were given weights of one, two, or three, re-
spectively. This procedure provided an overall syn-
thesis score for each subject, whether or not he
solved the problem. If the subject had solved the
problem prior to the twenty-fourth trial, he was
credited with a new belief for each trial of the re-
maining problem-solving periods and was given a
4-point synthesis score for each of the remaining
trials.

[Figure 1. Response-recording form with four panels, Trial 1 through Trial 4.]

Fig. 1. Response-recording form and one subject's responses to Problem-
Solving Period 3. (The subject received a response-recording form which included six boxes
like the one shown in the figure. Each box contained four sets of crosses and circles. The X
represented the position of the doodlebug at the outset of the problem; the O represented
the food. On each trial, the subject attempted to connect the X to the O by drawing four
lines. The numbered lines indicate the moves recorded by one subject on each of
four trials. Arrows and numbers have been supplied for the subject’s moves to aid the
reader.)
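
The weighting rule for synthesis can be made concrete with a small Python sketch. This is an illustration rather than the authors' scoring program: the names `SYNTHESIS_WEIGHTS` and `synthesis_score` are hypothetical, the input is assumed to be the number of new beliefs integrated on each trial, and the special credit given after an early solution is omitted.

```python
# Weights stated in the text: integrating 2, 3, or 4 new beliefs earns 1, 2, or 3.
SYNTHESIS_WEIGHTS = {2: 1, 3: 2, 4: 3}


def synthesis_score(beliefs_integrated_per_trial):
    """Sum the weights over trials; fewer than two integrated beliefs adds 0."""
    return sum(SYNTHESIS_WEIGHTS.get(k, 0) for k in beliefs_integrated_per_trial)


# Hypothetical period resembling the Figure 1 example: no integration on the
# first two trials, two beliefs on the third trial, three on the fourth.
example_period = [0, 0, 2, 3]
print(synthesis_score(example_period))  # 0 + 0 + 1 + 2 = 3
```

Summing such period scores over the scored periods would give the overall synthesis score described above.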

It should be noted that the four trials in the first
problem-solving period (Period 1) were not scored 
because these trials were employed to help the 
subjects become accustomed to the routine of re- 
cording responses. Furthermore, since the subjects 
did not receive information prior to this period, 
there was no way in which the subjects’ use of old 
and new beliefs could be identified. 


RESULTS 

The frequency of new-belief responses 
was analyzed via a factorial analysis of 
variance consisting of two levels of dogma- 
tism (high and low dogmatics) and three 
treatment levels (expert endorsement of 
new beliefs, expert endorsement of old be- 
liefs, and no endorsement or control). This 
analysis yielded an F < 1.00 (df = 2/84,
p > .10) for the effect due to treatments,
and an F < 1.00 (df = 1/84, p > .10)
for the effect related to dogmatism. How-
ever, the interaction of Dogmatism ×
Treatments yielded an F of 3.12 (df = 2/84,
p < .05).

Parallel trends were noted in the analy- 
ses of synthesis scores. Although no signifi- 
cant main effects due to treatments were 
obtained when synthesis scores were used, 
the Treatment × Dogmatism interaction
was again in the predicted direction, al-
though the obtained F of 2.75 (df = 2/84)
only approached significance (p < .07).

These results imply that endorsement of
old beliefs facilitated the acquisition and
integration of new beliefs for low dogmatics
but inhibited problem solving for high dog-
matics. An analysis of the differences be-
tween the means of the high and low dog-
matics yielded a t of 2.04 (df = 84, p <
.05) for analysis scores and a t of 2.06 (df
= 84, p < .05) for synthesis scores, imply-
ing that, relative to high dogmatics, the
performance of low dogmatics was im- 
proved when misleading information was 
attributed to an authoritative source. When 
new beliefs were endorsed, the analysis and 
synthesis scores of high dogmatics were 
greater than those of low dogmatics, but in 
neither case were the differences significant 
(p > .05). The analysis scores of high dog-
matics in the old-belief condition were sig-
nificantly lower than those of high dog-
matics in the new-belief condition (t =
2.56, df = 84, p < .05). Synthesis followed
the same trend (t = 1.78, df = 84, p < .05, 
one-tailed). In contrast, low dogmatics per- 
formed more poorly in the new-belief con- 
dition than in the old-belief condition as 
measured by both the analysis and syn- 
thesis scores. However, on neither measure 
was this difference significant. 

The Dogmatism × Treatments interaction
obtained on both measures of
problem-solving performance pro-
vides support for the present hypotheses.
They clearly imply that old-belief endorse- 
ment is particularly inhibitive for high dog- 
matics and facilitative for low dogmatics. 
Furthermore, high dogmatics appear to be 
particularly susceptible to authority influ-
ence; they tend to be inhibited by endorse-
ment of old beliefs and facilitated by en-
dorsement of new beliefs.

The subjects’ scores on the Gough-Sanford
(1957) Flexibility-Rigidity scale were
also available. Since these scores and those 
of the Dogmatism scale were moderately 
correlated (r = .53), a further analysis was 
made of the responses of the subjects who 
were high and low scorers (i.e., above or 
below the mean of the distribution of all 
scores) on both measures. This procedure 
yielded five subjects in each group. An 
analysis of variance of these data yielded 
no significant differences for the main ef- 
fects of treatments or dispositional levels on 
analysis and synthesis responses. However, 
the interaction of Treatments × Disposi-
tional Levels for the analysis scores yielded
an F of 3.98 (df = 2/24, p < .05), and the
same interaction for synthesis scores
yielded an F of 4.24 (df = 2/24, p < .05).

These results are displayed graphically in 
Figure 2. This interaction is similar in all 
essential characteristics to that obtained in 
the analyses based on dogmatism scores 
alone, but is decidedly more pronounced. 
These results provide evidence that the 
Flexibility-Rigidity scale and Dogmatism 
scale may be measuring functionally simi- 
lar characteristics rather than functionally 
different ones as suggested by Rokeach 
(1960). 

Responses that were violations of the
rules and represented the subjects’ failure to
approach the problem within the confines of
the rules of the game (i.e., disoriented re-
sponses) were analyzed. The results indi-
cated that neither the main effects of dog-
matism nor of treatments was significant.
However, the effect due to Flexibility-
Rigidity groups yielded an F of 4.00 (df
= 1/84, p < .05); and the effect due to the
combined characteristics of dogmatism and
flexibility yielded an F of 9.81 (df = 1/2,
p < .01). The overall tendency was for the
high-dogmatic-rigid group to make
more disoriented responses (X̄ = 12.80)
than the low-dogmatic-flexible group (X̄ =
3.80). In separate analyses, paralleling
those described above, the interaction of
Treatments × Dogmatism yielded an F of
2.77 (df = 2/84, p < .07).
[Figure 2. Two panels, ANALYSIS RESPONSES and SYNTHESIS RESPONSES, each showing DOGMATIC-RIGID and OPEN-MINDED-FLEXIBLE subjects under three kinds of endorsement: Old Belief, None, and New Belief.]

Fig. 2. The number of analysis (new-belief) and synthesis (integrative) responses made
by dogmatic-rigid and open-minded-flexible subjects under each of three kinds of endorse-
ment by authority.


The means of the disoriented scores for
the several groups in the experiment are
summarized in Table 1. There it may be
seen that in relation to the control, the ef-
fect of expert endorsement of old beliefs
was to reduce disorientation of low dogmat-
ics (X̄ = 4.33) and to increase disorienta-
tion of high dogmatics (X̄ = 9.53). The
means obtained from the new-belief con-
dition suggest the opposite tendency. Rela-
tive to the control, disorientation of low
dogmatics was increased (X̄ = 8.40) and
that of high dogmatics was reduced (X̄ =
6.60). Comparisons between the means indi-
cated that fewer “errors” were made by low
dogmatics than by high dogmatics when old
beliefs were endorsed (t = 2.41, df = 84, p
< .025) and fewer than by other low dogmatics
when new beliefs were endorsed (t = 1.89,
df = 84, p < .05, one-tailed).


TABLE 1
MEAN USE OF EACH CLASS OF RESPONSES BY
HIGH- AND LOW-DOGMATIC SUBJECTS
TO DIFFERENT TYPES OF
EXPERT ENDORSEMENT

                           Type of endorsement
                    Old belief       Control       New belief
Class of response   Low     High    Low     High   Low     High
                    dog-    dog-    dog-    dog-   dog-    dog-
                    matic   matic   matic   matic  matic   matic

Disoriented          4.33    9.53    7.40    7.73   8.40    6.60
Old belief           5.93    4.47    4.13    3.07   3.40    3.27
New belief           7.53    3.53    6.00    7.67   6.22    8.53
Synthesis           16.73    5.93   11.47   15.13  10.40   15.13
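
The crossover pattern in the disoriented scores can be seen by taking the difference between low- and high-dogmatic means within each endorsement condition. The Python sketch below is illustrative only; the dictionary and function names are hypothetical, and the six values are the disoriented-response means reported for Table 1.

```python
# Mean disoriented responses from Table 1, keyed by (endorsement, dogmatism).
DISORIENTED = {
    ("old belief", "low"): 4.33, ("old belief", "high"): 9.53,
    ("control", "low"): 7.40,    ("control", "high"): 7.73,
    ("new belief", "low"): 8.40, ("new belief", "high"): 6.60,
}


def low_minus_high(means):
    """Low-dogmatic mean minus high-dogmatic mean within each condition."""
    conditions = ("old belief", "control", "new belief")
    return {
        c: round(means[(c, "low")] - means[(c, "high")], 2)
        for c in conditions
    }


differences = low_minus_high(DISORIENTED)
print(differences)
```

A negative difference means that low dogmatics made fewer disoriented responses than high dogmatics. The sign is negative under old-belief endorsement, near zero in the control condition, and positive under new-belief endorsement, which is the form of the Treatments × Dogmatism interaction discussed in the text.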

Of particular interest is the finding that 
high and low dogmatics in the control con-
dition, where both new and old beliefs were
presented but where neither was endorsed, 
did not differ on measures of analysis or 
synthesis responses. These findings indicate 
that without the influence of an external 
source low dogmatics are not superior to
high dogmatics in either their ability to 
synthesize or analyze. 


DISCUSSION 


The results obtained in the present exper- 
iment support certain of the assumptions 
related to the problem-solving behaviors of 
high- and low-dogmatic persons when pro-
vided information from an authoritative
source. Low dogmatics turn to the new-be-
lief alternative when old beliefs are en-
dorsed, thereby rejecting the authority's ir-
relevant information in favor of one more
appropriate for the requirements of the 
task. In contrast, the large number of new- 
belief responses made by high dogmatics 
when new beliefs received endorsement sug- 
gests the dependency of these subjects on an 
external source such as an authority. 
However, the orientation of low dogmat-
ics toward task requirements and of high
dogmatics toward an authoritative source
may not be straightforward relationships. 
Both types, when comparing information 
against the task demands, may be influ- 
enced by authority but in different ways. 
The tendency of low dogmatics to accept
new beliefs when old beliefs were endorsed
could mean (a) that low dogmatics are task
oriented and, therefore, they accepted the
most appropriate belief regardless of influ-
ences associated with the source of the in-
formation, or (b) that low dogmatics post-
pone judgment of the adequacy of a solu-
tion based mainly on an expert’s endorse- 
ment, even though the information may be 
correct, and seek further information. The 
latter process is facilitative when authori- 
ties endorse old beliefs since the subject is 
led to consider other alternatives among 
which may be the correct one. On the other 
hand, if the authority is correct, the per- 
formance of the low dogmatic may be hin- 
dered because he is misled to consider other 
alternatives to the appropriate one. If the 
first of these interpretations were correct, 
low dogmatics would adopt new beliefs re-
gardless of which beliefs experts endorse. In
actuality, the findings were in accord with 
the second explanation, since low dogmatics 
accepted more new beliefs when they were 
not endorsed than when they were. 
Evidence in support of the second expla- 
nation was that endorsement of appropriate 
new beliefs had a disorienting effect on low 
dogmatics. If the person were influenced 
only by the task requirements, he would be 
expected to seize upon the new belief when 
it was provided rather than to be misled 
(that is, disoriented) by it. It is almost as if 
low dogmatics feel they have to weigh evi- 
dence from an authority most cautiously. 


When the outcome led to the correct answer 
(i.e., when old beliefs were endorsed), prob-
lem solving was facilitated and little dis-
orientation occurred. However, when the
outcome led to an inappropriate alternative
(i.e., when new beliefs were endorsed), they 
became oriented to incorrect alternatives 
which violated the problem rules rather 
than accept the correct answer provided by 
the authority. 

Unlike its facilitating effect on low dog- 
matics, old-belief endorsement proved to be 
particularly debilitating for high dogmatics. 
The use of the disoriented response by high 
dogmatics rather than the endorsed old-be-
lief response suggests that they did not
“blindly” follow expert advice but, in fact,
were misdirected in their problem-solving
efforts. Old-belief advice was compared
against the task requirements, its inappro-
priateness was recognized, but the high dog-
matics refused to contradict it because it
had received authority support. Instead of 
following the inappropriate old beliefs 
which received authority endorsement or 
entertaining nonendorsed new beliefs which 
were rejected by the authorities, high dog- 
matics chose to violate the problem rules. 
This interpretation suggests that high dog- 
matics are not uncritical followers of au- 
thority in that they can recognize the inap- 
propriateness of an expert’s position by 
comparing it against the requirements of
the task. This explanation is obviously con-
gruent with the rationale presented in the
introduction regarding the high dogmatics’
dependency on authority.

In the present experiment, new-belief ad- 
vice facilitated the attempts of high dog- 
matics to synthesize. One would expect that 
the provision of individual new beliefs 
would not substantially aid high dogmatics,
who still must confront the threat posed by
the formation of a new-belief system during
the synthesis stage. However, this distinc-
tion may not have been salient in the pres-
ent study. We suggest that the “miniature
cosmology” represented by the belief sys-
tem of the game or puzzlelike conditions of
the Denny Doodlebug problem is not neces-
sarily analogous to the definition of the be-
lief system as “man’s total framework for
understanding his universe [Rokeach, 1960,
p. 35].” Accordingly, the solution of the
Denny Doodlebug problem would not pose
a severe threat to the existing belief system
of high dogmatics.
ABILITIES AND DEVELOPMENTAL CHANGES IN
ELABORATIVE STRATEGIES IN PAIRED-
ASSOCIATE LEARNING OF YOUNG
CHILDREN¹


WILLIAM A. MALLORY²
Sonoma State Hospital, Eldridge, California 


One hundred and sixty middle- to lower-class kindergarteners and 
second graders received the Primary Mental Abilities test (PMA) and 
30-item lists of noun pairs. There was no elaboration on any item for 
subjects in the control condition. The subjects in the elaboration condi- 
tion received mixed lists of three types: auditory-elaboration items; 
visual-elaboration items; and no-elaboration items. Based on relative 
performance on auditory or visual items, elaboration subjects were 
termed “verbalizers” or “visualizers” and were given a “pure list” of 
exclusively auditory- or visual-elaboration items. Elaboration subjects 
recalled more than control subjects on no-elaboration items as well as 
on elaborated items. The elaboration second graders recalled more 
than elaboration kindergarteners. There were no grade differences for
control subjects. On the pure list, verbalizers recalled more auditory- 
elaboration items than visualizers, and visualizers recalled more visual- 
elaboration items than verbalizers. Kindergarten verbalizers did better 
on verbal meaning than kindergarten visualizers. 


The primary purpose of the present study 
was to investigate those stable characteris-
tics of children which allow some to learn
more efficiently under some conditions and 
others to learn more efficiently under other 
conditions. A secondary purpose of this re- 
search was to investigate the relative effi- 
cacy of visual and verbal modes of input of 
learning materials and to assess how this 
relationship may change with age. 

This investigation was oriented toward
two trends in present-day learning re-
search: (a) the “information-processing” or
“cognitive” viewpoint, which holds that the
learner is an active organism, one who de-
termines to a large extent how he will learn
(see, e.g., Neisser, 1967; Tulving, 1968);
and (b) the increasing realization that indi-
vidual differences in learning are worthy of
study in their own right, rather than as
troublesome error variance (see, e.g.,
Fleishman & Bartlett, 1969; Gagné, 1967).
One approach to the study of individual 
modes of information processing was that of
Frederiksen (1969, 1970). This approach
began with the notion that any process
theory ought to consider sources of individ-
ual variation as central to the theory and
led one to be concerned with the interaction
between the attributes of the learning task
and the characteristics of the learner. Some
features of such a “differential process
theory” have been outlined and applied to
several learning tasks by Frederiksen
(1970), who had noted that verbal learning
tasks are, to varying degrees, functionally
indeterminant; that is, there is more than
one way of processing the information or 
more than one strategy that can be utilized 
to remember the items. This view holds that 
an individual can use one strategy at one 
time and another strategy at another time; 
furthermore, the primary strategy that is 
utilized may change as the task proceeds, 
that is, as the individual receives feedback 
as to the effectiveness of his initial strategy.

Frederiksen (1969) found as predicted,
for three verbal learning tasks administered 
to college students, that performance was 
dependent upon a complex relationship be- 
tween the ability profile of the subject and 
the degree of constraint of the task, and 
that this relationship was mediated through 
use of strategies or characteristic modes
of information processing. A subsequent
study (Frederiksen, 1970) showed that certain
predicted features of these relationships
were replicable for a different sample
from the same population. Thus, there was
evidence that learning style or mode of
information processing was an important
determinant of learning performance for the
individual adult. This evidence bears a
relationship to findings on cognitive styles
(e.g., Witkin, Dyk, Faterson, Goodenough,
& Karp, 1962) and problem-solving styles
(...ch, 1965) in adults. Furthermore,
there is research which indicates that stylistic
factors may be of importance in children
(e.g., Kagan, Moss, & Sigel, 1963). Such
findings have clear implications for education,
particularly if they hold for children.
Some indication that differences in modes
of information processing may in fact affect
children's learning performance can be
found in Rohwer's research on elaboration in
children’s paired-associate learning. Rohwer
(e.g., 1967, 1968, 1970b) has asserted that
elaborative activity engaged in by the subject
facilitates learning. This “subject elaboration”
will be defined here as any means
by which the subject adds to the nominal
stimulus input.
In an attempt to influence the mode and
extent of subject elaboration in children,
Rohwer varied the mode of presentation of
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the stimulus materials. For example, if the 
two members of a noun pair are presented 
as linked by a verb (resulting in a sen- 
tence), it is inferred that the child is likely 
to process these items in the context of the 
sentence. Generally, for paired-associate 
tasks, learning has been found to be facili- 
tated to the extent that the material has 
been elaborated by the experimenter. The 
degree of such facilitation, however, has 
been found to be significantly dependent 
upon such stimulus variables as form class 
(Rohwer, 1967), type of depiction (Rohwer, 
Lynch, Suzuki, & Levin, 1967), sentence 
structure (Suzuki, 1969), and syntax (Ehri 
& Rohwer, 1969). Furthermore, the amount 
of such facilitation was also different for 
groups differing in such characteristics as 
age (Rohwer, 1967), ethnicity (Rohwer, 
Ammon, Suzuki, & Levin, 1971), and IQ 
(Rohwer, 1968). 

All presentation modes heretofore studied
have used either an aural or a visual mode 
of augmenting the stimulus input. The child 
either sees, hears, or both sees and hears, 
the material to be learned. For example, he 
may hear the words Cow-Ball, see a picture 
of a cow next to a ball, or both of these. It 
can be argued that the assessment of the 
relative efficacy of visual and auditory 
modes of input may be important for edu- 
cation. Thus, if it is found for a certain 
learning task that auditory presentation results
in learning superior to that under visual presentation,
then it is possible that classroom
learning of comparable instructional mate- 
rials may be more efficient under conditions 
of auditory presentation than visual presen- 
tation. Such a separation of the effects of 
visual and auditory input modes, however, 
is no simple matter. For example, if the ma- 
terial is presented in the visual mode only, 
the child may spontaneously label and then 
orally rehearse the items (Flavell, Beach, & 
Chinsky, 1966). Conversely, if the material 
is presented in the auditory mode only, the 
child may form a visual image of the items 
(Paivio, 1970). 

An investigation designed to study the 
two general questions described above 
ought to include a consideration of age 
changes in the relative “dominance” of par- 
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ticular modes of information processing. 
Rohwer prefers to view the question of age 
changes in information processing in terms 
of “modes of storage” (visual or verbal) of 
learning materials. Rohwer (1970a) hypothesized
(a) that the visual mode is dominant,
and (b) that the degree to which the
visual mode is generally dominant over the
verbal increases with age. Some indirect
support for the first hypothesis was found 
in data obtained by Dilley and Paivio 
(1968) which showed that pictures pro- 
duced better learning than words, and by 
Rohwer, Lynch, Levin and Suzuki's (1967) 
data which also showed that more efficient 
learning was associated more with pictures 
than with words. 

Data that may be interpreted as support- 
ing the second hypothesis were obtained by 
Rohwer (1968). Since the material used in 
Rohwer's experiment is similar to that used 
in the present research, an extended de- 
scription of Rohwer's method will be pre- 
sented. Four mixed lists of 25 noun pairs 
were administered to kindergarten, first- 
grade, and third-grade children. The lists 
were mixed with respect to the five different
ways in which the pairs were presented 
(presentation modes): (a) Names—nouns 
presented aurally without visual depiction; 
(b) Still—pictures of object pairs without 
aural naming; (c) Names-Still—a combina- 
tion condition with pictures of objects and 
their noun names presented aurally; (d) 
Sentence-Still—pictures of object pairs with 
a sentence containing their noun names pre- 
sented aurally; and (e) Names-Action—ac- 
tion pictures of object pairs with their noun 
names presented aurally. Each list consisted 
of five pairs of each of these five types of 
items. For each grade level, the order of 
pair types with respect to the associated
degree of learning performance (from least 
to most) was: (a) Names, (b) Still, (c)
Names-Still, (d) Sentence-Still, (e)
Names-Action. The relative superiority of
performance given pictorial items over per- 
formance given verbal items increased with 
age as predicted in Hypothesis 2. Of partic- 
ular relevance to the present study was 

Rohwer's later (1970b) observation relative 
to his 1968 data, that there appeared to be
reliable Subject x Presentation Mode interactions;
that is, certain children appeared
to benefit consistently more from
certain modes of presentation than did
other children. This evidence, of course, was
derived from lists in which the five different
types of pairs were mixed in a single list. As
will be discussed more fully, the use of
mixed lists presents certain inherent limitations
upon inferring degree of elaborative
facilitation from a given item type.
The above evidence suggests that the 
“optimal” mode of information processing 
for one child may not be the same as the 
optimal mode for another child. At this 
point, it is useful to consider possible age 
changes in processing modes (or strategies) 
in functionally indeterminant tasks from 
the point of view of “differential process 
theory.” Two types of optimization that 
may occur in any cognitive task have been
identified by Frederiksen (1970) and Gold- 
man (1970). First, the child may optimize 
within the constraints of the task by per- 
forming in the most effective manner that 
the task allows; second, the child may op- 
timize with respect to his abilities, that is, 
he may perform in the most effective manner
that his abilities allow. The two sorts of
optimization are not mutually exclusive and
may even work against one another for certain
individuals on certain tasks. The “normal”
optimal mode of solving a given problem
may not be in consonance with an individual’s
pattern of abilities.

The present study primarily concerned itself
with optimization of the second type. It
is hypothesized that the child will in fact
optimize in this sense, although there could
be important individual differences, particularly
age differences, in the extent to
which this optimization will occur. Goldman
(1970) found for adults, with respect
to optimization in a specially created,
functionally indeterminate problem-solving
task, that ability structure was relatively
more important in determining choices of
“strategies” for men than for women. He
also found that certain characteristics of
the task were relatively more important in
determining strategy choice for women than
for men. It is further hypothesized that the


PAIRED-ASSOCIATE LEARNING OF YOUNG CHILDREN 


child will attend more to those items con- 
forming to his strengths, abilities, or pre- 
ferred modes of information processing than 
to those items not so conforming. Thus, for 
example, it is supposed that a verbal child 
will tend to employ a strategy consisting of 
concentrating on verbally elaborated items 
more so than visually elaborated items. 
Thus, a child’s relative performance level 
on a given item type will be taken as a 


| measure of his extent of employment of 


modes of information processing associated 
with the item type as well as his ability to 


utilize processing modes associated with 


that item type. While it is felt that this 
study can provide some evidence bearing on 
this second hypothesis, the hypothesized 
link between attention and preferred modes 
of processing is a question requiring consid- 
erable future investigation. 


Rationale for Design


Because of the school grade differences in 
degree of benefit from different modes of 
elaboration found by Rohwer (1968), as 
well as the numerous cognitive shifts that 
take place in children between ages 5 and 7 
(summarized by White, 1965), it was de- 
cided to use kindergarteners and second 
graders as subjects in the present study. 

A mixed-list paradigm, though subject to 
certain limitations, offers an opportunity to 
indirectly infer preferred strategy types
from performance on each item type. Thus, 
mixed lists similar to those used by Rohwer 
(1968) were used, except that there were 
only three different item types: (a) auditory-elaboration
items, in which a picture of
the noun pair side by side was presented
along with a sentence linking the two
nouns; (b) visual-elaboration items, in
which the pair was presented in a definite
interaction, but the objects were
named only; and (c) no-elaboration items
in which the items were simply shown side
by side and named. Based on the relative
performance on each item type in the mixed-list
condition, the subjects were classified as
visualizers or verbalizers.

There are certain limitations inherent in
basing conclusions about the relative
efficiency of different item types upon
results obtained from mixed lists. Although
the order of appearance of different item 
types was randomized, there could be an 
effect associated with the context within 
which an item type occurs. For example, the 
presentation of three auditorily elaborated 
items in a row followed by a single item 
with no elaboration could suggest to some 
children that they form a sentence linking 
the two nouns of the nonelaborated pair. 
This could facilitate the learning of some of 
the nonelaborated items. In this context, 
more nonelaborated items would be recalled 
than in the context of a pure list of all 
nonelaborated items. Thus, a control condi- 
tion was included in which exactly the same 
pairs were given in the same order as in the 
mixed list except that the entire list was 
presented as nonelaborated items. Also, to 
estimate the degree of stability of the domi- 
nant modes of processing inferred from rel- 
ative performance on different item types 
on the mixed list, an additional learning 
session was given which consisted of pure 
lists of visually elaborated or aurally elabo- 
rated items presented to the visualizers and 
verbalizers identified on the mixed list. 

If such a classification of visualizers and
verbalizers has some validity, one might in- 
vestigate those characteristics of verbalizers 
that differentiate them from visualizers. 
Since some degree of verbal ability is likely 
required in benefitting from auditory elabo- 
ration (and similarly for spatial ability and 
visual elaboration), performance on a given 
item type might be expected to be corre- 
lated with a given measure of ability. For 
example, performance on visually elabo- 
rated items might be more highly associated 
with spatial ability than it is with verbal
ability. As noted above, there is evidence
from adult data that the extent of correlation
of performance with specific ability
measures is dependent on the characteristics
of the task (Frederiksen, 1969). These relationships,
however, have not been investigated
for children.

Performance on paired-associate learning 
tasks by children has been related to a 
number of general intelligence measures: 
for example, Raven's Progressive Matrices 
(Green, 1969; Rohwer, 1968; Rohwer et al., 




1971); Peabody's Picture Vocabulary Test 
(Green, 1969; Rohwer, 1967; Rohwer et al., 
1971); and Lorge-Thorndike’s IQ (Green, 
1969). Paired-associate learning under dif- 
fering elaboration conditions, however, has 
not heretofore been related to specific abil- 
ity measures in children. 

If abilities are to some extent differenti- 
ated in young children, then further infor- 
mation could be gained from administering 
specific ability measures over giving a sin- 
gle global measure of intelligence. The issue 
of the presence and degree of differentiation 
of abilities in young children is certainly 
unresolved (e.g., Guilford, 1967). However, 
in agreement with Anastasi (1970), it was 
felt that some degree of differentiation is 
present at the lower end of the age range of 
the present study (about 5-8½ years), and
that possibly some additional differentia- 
tion has taken place by the upper end of 
this range. Therefore, it was decided to ad- 
minister specific ability measures and to 
correlate these with overall performance on 
paired-associate learning as well as with 
performance on each of the three item types 
separately. 

In the present research, the SRA Primary
Mental Abilities test was chosen as the sub- 
tests are based on factor scores and thus 
provide relatively uncorrelated or pure 
measures of ability. The following subtests 
of the Primary Mental Abilities test are the 
same for both the Kindergarten through 
First Grade form and the Second through 
Fourth Grades form: Verbal Meaning, Spatial
Relations, Number Facility, and Perceptual
Speed.

Recall Rohwer's (1968) finding that the 
degree of superiority under visual modes of 
presentation increased as a function of age. 
In this connection it is interesting to view 
some results obtained by Thurstone. Thur- 
stone (1955) plotted rescaled scores from 
each of the Primary Mental Abilities sub- 
tests resulting in a measure of “proportion 
of adult status” achieved over the age range
from 0 to 19 years. If we consider only the 
Verbal Meaning and Spatial Relations subtests
over only the age range from 5 to 8½
years, we note that Verbal Meaning increases
about .21 and Spatial Relations increases
about .27 during these 3½ years.

WILLIAM A. MALLORY

Thus, the hypothesized greater increase in 
ability to benefit from visual forms of elaboration
relative to that from verbal elaboration
appears to be paralleled by a corresponding
greater increase in spatial ability
relative to verbal ability. 


METHOD


Subjects

A total of 160 middle- to lower-class children
attending regular classes in the John Swett Unified
School District were used. All children attended
either Carquinez School, Crockett, California
(School 1), or Hillcrest School, Rodeo, California
(School 2). All kindergarten and second-grade
children in these two schools were considered as
“candidates” with the exception of those students
the teachers indicated were not native speakers of
English. Four children were rejected for this reason,
although a number of children with Spanish
surnames were retained in the sample. All candidates
from School 1 who were able to participate
in all sessions within the allowable time limits were
used. The subjects necessary to complete
the sample were randomly chosen from
School 2. Since School 2 utilized homogeneously
grouped classes, the selection was made so that an
equivalent proportion of subjects was chosen from
each classroom. For example, after completion of
testing at School 1, 19 second-grade boys were
needed from School 2. The three second-grade
classes at School 2 contained 18, 12, and 9 boys.
Accordingly, nine, six, and four boys were randomly
selected from these classes, respectively.
In all, 102 children were used from School
1 and 58 from School 2.

Design 


A 2 X 2 X 2 X 2 orthogonal design was employed.
The primary factors of interest were grade
and elaboration. Accordingly, half of the subjects
were kindergarteners and half, second graders;
half were assigned to the elaboration condition
and half to the no-elaboration or control condition.
The elaboration condition, described in more
detail later, consisted of presentation of the mixed
lists; that is, the items received auditory elaboration,
visual elaboration, or no elaboration. The
control condition consisted of presentation of the
same items but none of them with elaboration. The
subjects were further factorially divided so that
half were boys and half were girls; half received
Presentation Sequence 1 (List 1 followed by List
2) and half received Presentation Sequence 2
(List 2 followed by List 1). Thus there were 16
unique subject-treatment combinations in the
entire design.
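As an illustration only, the crossing of the four two-level factors can be enumerated in Python. The level labels are shorthand chosen for this sketch, and the figure of 10 subjects per cell is an inference from 160 subjects under the assumption of a fully balanced design; the text does not state it.

```python
from itertools import product

# Four two-level factors of the design (level labels are illustrative shorthand).
factors = {
    "grade": ["kindergarten", "second"],
    "condition": ["elaboration", "control"],
    "sex": ["boy", "girl"],
    "sequence": ["List 1 then List 2", "List 2 then List 1"],
}

# Every unique subject-treatment combination in the orthogonal design.
cells = list(product(*factors.values()))

n_subjects = 160
# Assumption: subjects are spread evenly over the cells (balanced design).
subjects_per_cell = n_subjects // len(cells)

print(len(cells), subjects_per_cell)
```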

In addition, a session consisting of a list of pure
(solely aurally elaborated or solely visually
elaborated) items was given to 40 of the children in the
elaboration condition only. The selection of subjects
for the pure list was based on performance in
the mixed-list session.


Materials 

Two 30-item lists of pairs of common objects 
were constructed. The same lists were used for 
both elaboration and the control conditions. For 
the elaboration condition, the lists were mixed with 
respect to the following three types of items: (a) 
sentence auditory-elaboration items (e.g., a picture
of a rock and a bottle side by side accompanied
by the following spoken sentence, “The ROCK
breaks the BOTTLE”); (b) visual-locational
items (e.g., a picture of a cow wearing a tie accompanied
by the spoken names of these two objects);
and (c) no-elaboration items (e.g., a picture of an
ant and a mouse side by side accompanied by the 
spoken names of these two objects). 

Pictures of each pair of objects were drawn in 
black and white on Clearprint technical paper with 
Rapidograph pen. Colored slides were photo- 
graphed from these pictures. The requirement for 
producing two 30-item lists for both study and test 
trials with the auditory- and no-elaboration slides 
the same for both elaboration and control treat- 
ments and the visual-elaboration slides different 
for the two treatments resulted in 140 unique 
slides. To present three examples and four com- 
plete trials (both study and test), 246 slides were 
to be shown to each subject. The placement of the 
246 slides in the four 80-slide trays was such that 
all tray changes would occur between trials. 
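The slide counts can be checked with a small Python sketch. It rests on assumptions the text does not state: that each 30-item list contained 10 items of each type, that one test slide per item served both treatments, and that each example used one study slide and one test slide. Under those assumptions the totals of 140 and 246 are reproduced.

```python
LISTS = 2
ITEMS_PER_LIST = 30
ITEMS_PER_TYPE = 10   # assumption: equal split across the three item types
TREATMENTS = 2        # elaboration and control
TRAYS = 4
TRAY_CAPACITY = 80


def unique_slides():
    """Count distinct slides that had to be produced (under the assumptions above)."""
    # Auditory- and no-elaboration study slides are shared by both treatments.
    shared_study = LISTS * 2 * ITEMS_PER_TYPE
    # Visual-elaboration study slides differ between the two treatments.
    visual_study = LISTS * ITEMS_PER_TYPE * TREATMENTS
    # Assumption: one test slide per item, shared by both treatments.
    test = LISTS * ITEMS_PER_LIST
    return shared_study + visual_study + test


def slides_shown(examples=3, trials=4):
    """Count slides shown to one subject: examples plus study and test trials."""
    # Assumption: each example has one study slide and one test slide.
    return examples * 2 + trials * 2 * ITEMS_PER_LIST


print(unique_slides(), slides_shown(), TRAYS * TRAY_CAPACITY)
```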

A hookup consisting of a Kodak Carousel 800
projector; a Kodak Carousel programmer, Model
...; and a Wollensak 3M tape recorder was used to
record the experimenter's voice naming (or using
in the short sentence) all objects as they were to
appear and to automatically change the slides at
the sound of a beep every 5 seconds on both study
and test trials. Each subject heard one of the following
recordings: (a) List 1 followed by List 2
(Sequence 1) with no sentence elaboration, (b)
List 2 followed by List 1 (Sequence 2) with no
sentence elaboration, (c) Sequence 1 with sentence
elaboration, or (d) Sequence 2 with sentence
elaboration. The time sequencing of these recordings
is described with the procedure for Session 4.


Procedure 


The experiment was administered in four experimental
sessions as follows: In Session 1, the
Verbal Meaning subtest of the appropriate grade-level
form of the SRA Primary Mental Abilities
test (... revision) was administered. The mean
duration of this session was approximately 25
minutes. In Session 2, the Spatial Relations and
Perceptual Speed subtests of the Primary Mental
Abilities test were administered. The mean duration
of this session was approximately 25 minutes.
In both Sessions 1 and 2, the kindergarteners were
tested in squads of three or four and the second
graders were tested in squads of four or five. 
Sessions 1 and 2 were typically given on succeeding 
school days and were in all cases given from 1 to 
4 calendar days apart. In Session 3, the Number 
Facility subtest of the Primary Mental Abilities 
test was administered. At School 1, this subtest 
was given to entire classes (ranging in size from 
14 to 33) at one time. The testing was done so 
that a monitor watched no more than five children. 
As this arrangement was not possible at School 2, 
this session was given in squads of the same size as 
in Sessions 1 and 2. The mean duration of Session 
3 was approximately 30 minutes. 

Session 4 consisted of mixed-list paired-associate 
learning. The slides were shown on a small port- 
able screen approximately 6 feet from the pro- 
jector. The subject was informed that he would 
see some pairs of common objects and that he 
was to learn them in such a way that he could 
produce the name of the second member of each 
pair when shown the first. After three practice and 
test items, which were repeated if necessary until 
the subject produced at least one correct response, 
the first regular trial commenced. 

During the 5-second interval between the ex- 
amples and the first study trial, the experimenter 
said, “Now we're going to see more pictures of two 
things together. Watch closely and listen care- 
fully.” In the 5 seconds between each study and 
test trial, the child was reminded, “Now when you 
see just one of the two pictures, tell me the one 
that went with it.” In the 15 seconds between 
Trials 1 and 2 of the first-given list as the ex- 
perimenter changed slide trays he said, “We will 
do the same thing again. Try to remember the two 
things that go together like before.” 

Toward the end of the 30 seconds between lists; 
that is, between Trial 2 of the first list and Trial 1 
of the second list, after changing trays, the child 
was told, “Now we will do the same thing with 
different pictures.” The interval between Trials 1 
and 2 of the second list was the same as that for 
the first list. All subjects thus received four complete
trials.

Responses were given orally. The experimenter, 
who was seated next to the subject, recorded 
whether the response was correct, an intrusion 
from elsewhere on the present or previously 
presented list, or an intrusion of an unpresented
word.


Pure List 


Each of the 80 subjects in the elaboration condition
was assigned a T score (M = 50.0, SD =
10.0) based on number of correct auditory-elabora- 
tion items and a T score based on number of 
correct visual-elaboration items over the four trials 
of Session 4. For each subject, the visual-elabora- 
tion T score was subtracted from the auditory- 
elaboration T score yielding a difference score. 
Subjects with positive difference scores were
termed verbalizers and those with negative difference
scores visualizers.

TABLE 1
SUMMARY OF DESIGN FOR PURE-LIST LEARNING SESSION

Grade          Verbalizers            Visualizers
Kindergarten   AUD list | VIS list    AUD list | VIS list
Second         AUD list | VIS list    AUD list | VIS list

Note. n = 5 in each group. Abbreviations:
AUD = auditory elaboration; VIS = visual
elaboration.
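The classification rule (T scores with M = 50 and SD = 10, then the sign of the auditory-minus-visual difference) can be sketched in Python. The function names, the use of the population standard deviation, and the toy scores are assumptions made for illustration; they are not data or procedures from the study.

```python
from statistics import mean, pstdev


def t_scores(raw_scores):
    """Rescale raw scores to T scores (mean 50, SD 10).

    Assumption: the population SD is used; the source does not say which.
    Raises ZeroDivisionError if every raw score is identical.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [50.0 + 10.0 * (x - m) / sd for x in raw_scores]


def classify(aud_correct, vis_correct):
    """Return (difference score, label) per subject from the two item-type totals."""
    aud_t = t_scores(aud_correct)
    vis_t = t_scores(vis_correct)
    results = []
    for a, v in zip(aud_t, vis_t):
        diff = a - v  # auditory-elaboration T minus visual-elaboration T
        if diff > 0:
            label = "verbalizer"
        elif diff < 0:
            label = "visualizer"
        else:
            label = "unclassified"  # a zero difference is not covered by the text
        results.append((diff, label))
    return results


# Hypothetical totals for four subjects (not data from the study).
aud = [10, 20, 30, 40]
vis = [40, 30, 20, 10]
labels = [label for _, label in classify(aud, vis)]
print(labels)
```

Sorting subjects by difference score within each grade would then identify the most extreme verbalizers and visualizers.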

Within each grade level, the 10 highest verbal- 
izers (those with the highest positive difference 
scores) and the 10 highest visualizers were to be 
given the pure list, which consisted of an unmixed 
list; that is, either all auditory-elaboration items 
or all visual-elaboration items. The 40 subjects so 
identified were assigned factorially to either the 
pure auditory-elaboration or pure visual-elabora- 
tion condition. Thus a 2 X 2 X 2 design (depicted 
in Table 1) resulted for the pure list. 

During the administration of the pure list, how- 
ever, one of the kindergarten verbalizers and one 
of the second-grade verbalizers were absent. These
children were replaced with the eleventh highest 
verbalizers in their respective grades. 

From the four 25-item lists provided by Rohwer 
(mentioned above), 55 items had not been used in 
Session 4. From these items, 30 were chosen which 
were amenable to both auditory and visual elaboration.
The order for the two study and two test
trials was rerandomized. The drawing, photog- 
raphy, and recording were equivalent to those of 
Session 4. 

Upon entrance to the room, the subject heard: 

Do you remember what we did last time? First
you will see two things together. Then you will
see just one of the two and you try to tell me
the other one that goes with it. Now we'll try
some just to remember the correct way to do
it.

(3 practice study items with auditory elaboration
for auditory-treatment subjects and visual elaboration
for visual-treatment subjects)

The experimenter then said:

Like before, when you see just one picture, tell
me the other one that goes with it.

(3 practice test items)

The remainder of the pure list was identical to 
that of Session 4 except that in the pure list there 
were only two complete trials. The pure list lasted 
approximately 15 minutes per subject. 


HYPOTHESES
Strategies in Information Processing 


1. It was expected that verbalizers would 
correctly recall relatively more items on the
purely auditory list than would visualizers
on the purely auditory list and that visualizers
would correctly recall relatively more
items on the purely visual list than would
verbalizers on the purely visual list. This
hypothesis is based on the following assumptions:
(a) the experimental conditions
affect the expected effectiveness of process- 
ing modes (strategies) ; thus, for example, a 
strategy of remembering the items in the 
context of a sentence is more effective when 
the item is presented in a sentence, than 
when the item is not so presented; (b) 
strategy preferences are relatively stable 
over time; and (c) the stability is equiva- 
lent for both tasks or equivalently function- 
ally indeterminant. 

It is true by definition that, in the elaboration
condition of the mixed-list learning
session, verbalizers recall relatively
more auditory-elaboration than visual-elaboration
items, and conversely. In the control
condition, the subjects should be just as
likely to recall a relative amount of one
item type as the other. Therefore, a prediction
can be made as follows:

2. It was expected that on the mixed list, 
the item-type intercorrelations, for example, 
auditory-elaboration items with visual- 
elaboration items, would be higher in the 
control condition than in the elaboration 
condition. This hypothesis assumes the fol- 
lowing: (a) the item types affect the strate- 
gies used, that is, different strategies may 
be used on different item types; (b) high 
intercorrelations reflect common strategies;
and (c) attention also affects the correlation,
that is, the children may develop
strategies of paying more attention to certain
item types at the expense of the other
item types.

It has been previously stated that verbal
ability may be expected to correlate with
performance on verbally elaborated items
and spatial ability with visually elaborated
items. Therefore:

3. It was expected that those subjects in
the elaboration condition who performed
better on the Verbal Meaning subtest relative
to the Spatial Relations subtest would
tend to perform better on the auditory-elaboration
items relative to the visual-elaboration
items, and conversely; or, stated in
a slightly different way: Of the following
four correlations: (a) spatial relations with
visual-elaboration items; (b) verbal meaning
with auditory-elaboration items; (c)
spatial relations with auditory-elaboration 
items; and (d) verbal meaning with vis- 
ual-elaboration items; a and b would be 
higher than c and d. This hypothesis assumed
the following: (a) specific ability
tests reflect specific processes likely to occur 
in the learning tasks used here, (b) high 
correlation reflects a common process, and 
(c) different item types elicit different 
strategies. Similarly, it was expected that 
these ability, item-type correlations would 
extend to the pure list; for example, it was 
expected that performance on the Verbal 
Meaning subtest would be correlated with 
performance on the aurally elaborated pure 
list. 

4. It was expected that the Perceptual 
Speed subtest would hold a similar though 
less pronounced relationship to visual-elab- 
oration items as the Spatial Relations sub- 
test because the visual imagery involved in 
performing the Perceptual Speed test is less 
aoc than that of the Spatial Relations 
test.


Age Differences 


It should be recognized that Hypotheses 
1-4 may be age dependent; that is, the rela- 
tionships proposed in these hypotheses 
could differ as a function of age. In addition,
the following two hypotheses are explicitly
concerned with age differences.

5. It was expected that second graders
would perform relatively better on visual-elaboration
items in both the mixed and pure
conditions than kindergarteners. This hypothesis
is stated because a number of findings
(e.g., Rohwer, 1970a) indicated that
older children benefit relatively more from
visual forms of elaboration than younger
children.

6. It was expected that more items of all
types (whether elaborated or not) would be
recalled by second graders than by kindergarteners.
This hypothesis is based on the
previous findings (e.g., Rohwer,
1967) that in Grades kindergarten, 1, and 3,
paired-associate learning performance was
an increasing function of grade level.


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RESULTS 


Mixed-List Learning Performance 


The numbers of items correctly recalled in
the mixed-list session were plotted in two
different ways: (a) by each list and trial 
separately combined over grades (see Fig- 
ure 1); and (b) by each grade separately 
combined over lists and trials (see Figure 
2). These figures suggest a number of 
trends: (a) the auditory-elaboration items 
(those items receiving auditory elaboration 
in the elaboration condition) presented 
without elaboration are apparently intrinsi- 
cally easier than are either the visual-elab- 
oration items (those items receiving visual 
elaboration in the elaboration condition) or 
the no-elaboration items (those items re- 
ceiving no elaboration in either condition) 
(Figure 1); (b) for second graders, no-elab- 
oration items within an elaboration list were 
recalled more than no-elaboration items 
within a control list (Figure 2), and (c) 
second graders appeared to benefit more 
from both types of elaboration than did 
kindergarteners. This observation is based
on the fact that for both visual-elaboration
and auditory-elaboration items, in the elaboration
condition the second graders recalled
more than did the kindergarteners,
while in the control condition, the children
in the two grades recalled essentially the
same number of items (Figure 2).

Fig. 1. Number of items correctly recalled in
mixed-list learning session by item type, elaboration
condition, lists, and trials.

Fig. 2. Number of items correctly recalled in
mixed-list learning session by grade and elaboration
condition for each item type summed over
lists and trials.

Analysis with dependent variables separated. For the purpose of
assessing the effects upon learning performance of the independent
variables (grade, elaboration, sex, and sequence), a 2 × 2 × 2 × 2
multivariate analysis of variance was conducted. Since the use of
repeated measures as factors in an analysis of variance is likely to
be inappropriate for data such as these (see, e.g., Bock, 1963), each
of the three item types on each of the two trials of each of the two
lists was considered a single dependent variable. As a result, there
were 12 dependent variables. For example, List 1 TL1 AUD refers to
those auditory-elaboration items that were given on the first trial of
the first-given list. The means and F ratios are presented in Tables 2
and 3.
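
The way the 12 dependent variables arise can be sketched as follows. This is only an illustration in Python; the label format and the function name `dependent_variables` are assumptions made here, not the notation used in the tables.

```python
from itertools import product

LISTS = ("List 1", "List 2")
TRIALS = ("Trial 1", "Trial 2")
ITEM_TYPES = ("AUD", "VIS", "NON")  # auditory, visual, no elaboration


def dependent_variables():
    """Return one label per list x trial x item-type combination."""
    return [
        f"{lst} {trial} {item}"
        for lst, trial, item in product(LISTS, TRIALS, ITEM_TYPES)
    ]


if __name__ == "__main__":
    for label in dependent_variables():
        print(label)
```

Treating each combination as its own dependent variable is what allows the multivariate analysis to avoid using list, trial, and item type as repeated-measures factors.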
Although neither the multivariate main effect for grade nor that for
sex reached significance at the .05 level, 5 of the 12 univariate Fs
for sex were significant at the .05 level. In fact, on all 12
measures, boys correctly recalled more items than girls. The highly
significant multivariate elaboration effect reflects the fact that the
children recalled many more elaborated items. Also, children in the
elaboration condition recalled more no-elaboration items than did
children in the control condition on both trials of both lists, and on
Trial 2 of the first list, the difference was significant. The
significant multivariate sequence effect was such that when
considering only second-given lists, List 1 was learned better than
List 2, although this list-sequence difference was absent for
first-given lists. This result is commented upon more fully in the
Discussion section.
Although none of the multivariate interactions reached significance at
the .05 level, 9 of the 132 univariate interactions were significant
at this level. However, only those univariate interactions reaching
significance at the .01 level are interpreted here. On List 1 TL1 AUD
and List 1 TL1 VIS, the Grade × Sequence effect indicated that for
these items second graders did relatively better on Sequence 2 than
Sequence 1, while the reverse was true for kindergarteners. The
Grade × Elaboration effect on List 1 TL1 VIS was such that second
graders benefited more than kindergarteners from visual elaboration on
the initial trial.
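
The choice of the stricter .01 criterion can be motivated by a simple count. The sketch below (Python, offered as an illustration rather than as part of the original analysis) counts the interaction terms in a four-factor design and the number of univariate interaction tests that would be expected to reach a given level by chance alone if no interaction were present. The function names are hypothetical.

```python
from math import comb


def interaction_terms(n_factors):
    """Number of two-way and higher-order interactions among n_factors factors."""
    return sum(comb(n_factors, k) for k in range(2, n_factors + 1))


def expected_chance_hits(n_tests, alpha):
    """Expected count of tests significant at alpha when every null hypothesis holds."""
    return n_tests * alpha


N_FACTORS = 4      # grade, elaboration, sex, sequence
N_DEPENDENT = 12   # lists x trials x item types

# 11 interaction terms, each tested on 12 dependent variables
n_tests = interaction_terms(N_FACTORS) * N_DEPENDENT

if __name__ == "__main__":
    print(n_tests)
    print(expected_chance_hits(n_tests, 0.05))
    print(expected_chance_hits(n_tests, 0.01))
```

Under these assumptions about 6.6 of the 132 tests would be expected to reach the .05 level by chance, which is not far below the 9 observed, whereas only about 1.3 would be expected at the .01 level.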
Pooled dependent variables. To determine whether the results obtained
would differ if item types or lists and trials were pooled, the above
analysis was repeated with each of the following variations in the
composition of the dependent variables: (a) by each item type summed
over lists and trials and (b) by lists and trials summed over item
types. Combining over lists and trials revealed second graders
recalled significantly more visual-elaboration items than
kindergarteners (F = 4.05, df = 3/142, p < .05). This view of the data
also revealed that on each of the three item types and on three of the
four trials, boys performed significantly better than girls. Each of
these univariate F tests showed
TABLE 2
Mean Number of Items Correctly Recalled in the Mixed-List Learning
Session by Each List, Trial, Item-Type Combination

Columns: Grade | Condition | Sex | Sequence | Trial 1 AUD (Mc, Mp) |
Trial 1 VIS (Mc, Mp) | Trial 1 NON (Mc, Mp) | Trial 2 AUD (Mc, Mp) |
Trial 2 VIS (Mc, Mp) | Trial 2 NON (Mc, Mp) | Row mean (Mc, Mp)

List 1
K/C|M| 1 |48 2.3 2.1 6.6 4.2 3.8 
K/C|M| 2 |41 1.4 8 6.1 2.9 2.7 
K/C|F| 1 |3.2 I5 1.2 5.8 24 2.5 
K|C|F 2 |3.7|40]| .9|1.5|1.0|1.3] 5.7 | 5.9 | 2.8 | 3.1 | 8.1 | 3.0 
K|E|M| 1 |6.6 4.2 1.7 7.6 1.2 4.9 
K|E|M|2 |54 2.7 1.0 7.1 5.5 3.4 
K|E|F |1|53 3.7 1.1 6.7 5.9 3.2 
K|E|F | 2 |51|5.6|2.8|3.4| 1.1| 2| 7.7 | 7.3] 6.7 | 6.3 | 2.6 | 8.5 
2 |C|M|1]|42 1.5 1.8 6.2 3.7 4.2 
2|C|M|2 |40 1.0 "e 6.2 2.9 2.3 
BEONE | 1 | 2.0 1.0 8 4.7 2.4 2.1 
2|C|F|2]|3.6|3.4|12|1.2| .9|1.1|6.3|5.8|2.5| 2.9| 8.5 | 3.0 
2 |E|M|1 |5.4 3.9 1.0 7.6 6.2 4.2 
2|E|IM|2 |7.5 5.9 1.7 8.5 8.9 5.4 
2 |EIF|1]|48 357 1.1 7.4 6.8 3.2 
2 |E|F|2.]|6.8/6.1|4.7/ 46|1.3| .3|8.4|8.0| 7.7 | 7.4 | 4.6 | 4.4 
List 2
K C|M|1 |43 2.4 2.6 6.6 4.2 4.0 3.99 
K|C|M|2 |4.8 2.8 2.8 6.1 4.5 4.1 3.51 
K C|F|1]3.3 1.1 9 6.0 2.8 2.1 2.68 
K/C|F | 2 |5.1]44|2.0] 2.0 | 2.6 | 2.1] 6.6 | 6.3 | 4.5 | 4.0 | 4.2 | 3.6 | 3.43 | 3.40 
K|E|IM|1]|641 4.1 1.8 7.9 71 3.9 5.26 
K/E/M]/ 2 |5.5 4.1 2.1 7.2 6.1 44 4.54 
K|E|F|1]|46 3.0 1.6 6.9 5.4 3.6 4.25 
K|EIF|2 |55|5.7|3.8/3.8| 1.6 | L8| 7.1 | 7.3 | 7.6 | 6.6 | 4.3 | 4.0 | 4.66 | 4.67 
2 |C|M|1 46 2.0 1.1 6.8 3.7 3.9 3.64 
2 |c|M|2 |47 3.0 2.5 6.3 5.0 4.6 3.64 
2 |C|F|1]32 .8 7 5.7 2.4 2.8 2.98 
2 |Cc F|[2 |40|41] 2.4] 2.0] 2.4| 1.7] 6.8 | 6.4 | 4.4 | 3.9 | 5.5 | 4.2 | 8.62 | 8.82 
2 |E|M| 1 |6.1 3.7 1.5 8.4 7.2 3.4 4.88 
2 E|M|2]78 6.0 3.8 8.8 8.9 6.0 6.62 
2 EJF] 1 |55 4.8 1.3 8.1 7.9 4.2 4.90 
2 E|[F|2 64/64|48|48|2:2] 2.2|8.0 | 8.8 | 7.2 | .8 | 5.0 | 4.6 | 5.59 | 5.49 
Note.—Mc refers to cell means; Mp refers to the means pooled over sex
and sequence. Abbreviations: K = kindergarten; C = control;
E = elaboration; AUD = auditory elaboration; VIS = visual elaboration;
NON = no elaboration.
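
The relation between the two kinds of means in Table 2 can be illustrated with one block of the table. The sketch below assumes equal numbers of children per cell, so that the pooled mean is the unweighted average of the four cell means; the values are read from the kindergarten, control-condition rows for List 1, Trial 1, NON items, and the function name is hypothetical.

```python
def pooled_mean(cell_means):
    """Unweighted mean of cell means (assumes equal cell sizes)."""
    return sum(cell_means) / len(cell_means)


# Cell means (Mc) as read from Table 2: kindergarten, control condition,
# List 1, Trial 1, NON items, for male/sequence 1, male/sequence 2,
# female/sequence 1, female/sequence 2.
KC_LIST1_TRIAL1_NON = [2.1, 0.8, 1.2, 1.0]

# Pooled mean (Mp) printed in the table for the same block.
PRINTED_MP = 1.3

if __name__ == "__main__":
    print(pooled_mean(KC_LIST1_TRIAL1_NON), PRINTED_MP)
```

The computed value agrees with the printed pooled mean to within rounding to one decimal place.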


significance at close to the .02 level. The multivariate
Grade × Elaboration effect (F value illegible, df = 3/142, p < .04)
indicated that second graders benefited significantly more from
elaboration of both types than did kindergarteners. Inspection of the
univariate Fs revealed that this effect was significant at the .01
level for visual-elaboration items and the .05 level for
auditory-elaboration items. The fact that it was significant for
visual-elaboration items offers clear support for Hypothesis 5 that
second graders would benefit more from visual elaboration. It is also
important to note, however, that second graders benefited more from
auditory elaboration as well.

Item-type intercorrelations. The Pearson 
product-moment item-type intercorrelations 
TABLE 3
Summary of Multivariate Analysis of Variance for Mixed-List
Learning Sessions

Columns: Source | Univariate Fs for List 1 Trial 1 (AUD, VIS, NON),
List 1 Trial 2 (AUD, VIS, NON), List 2 Trial 1 (AUD, VIS, NON),
List 2 Trial 2 (AUD, VIS, NON) | Multivariate F | p

Grade (A) 0.0| 3.2/|0.0| 1.0 1.8|1.8| 1.6 | 3.8 | 0.0| 3.2| 2.4| 2.8| 1.3/.298 
Elaboration (B) 46.0 | 108.1 | 0.0 | 32.3 | 141.8 | 8.8 | 80.8 | 58.0 | .2/20.6/80.2] 1.6/21.9].001 
Sex (C) 86| 3.2/3.3) 23) 2.7|6.2| 6.7| 4.2|5.0| 1.3] 2.4. .8 1.4.1 
Sequence (D) 2.3 211.6] 2.5 .2|0.0| 6.3} 7.3 |16.5| 0.0] 6.7/12.8) 4.1).001 
AXB 2.6 8.8] .2| 1.6 3.8 | 1.8 | 4.4] 2.6 | 3.0] 2.4) 3.6) 0.0) 1.5).148 
AXC 0.0) 0.0/0.0} 0.0} 0.0] .1 6 .2|0.0| 0.0) .2| .4| 41.961 
AXD 7.7 | 10.0] 2.8] 1.6] 3.2|3.7 1| 1.9) 5.5) .3| .1| 1,4} 1.5].128 
BXC E 0.0} .5]| 1.1 1.2/1.0 1 .6| .1| .3| .5| .2) 1.2),257 
BXD 3 1.2 | 2.4 d 2.7) .4| 0.0| 0.0] .3| .1 1.2) .1| 1.1.97 
CXD 2.8 .2|2.0| 2.8 1.6/6.2] 1.2] 0.0 | .1|1.5| 1.0] .7| .8|.675 
AXBXC E 1.2] .2]| 0.0 MI TE -l .4 | 1.0) 0.0} 0.0) .2^. .6/.811 
AXBXD 23) 48| .2| 0.0] 2.7/3.7] 2.3| 0.0| .4| 0.0) .3| .1| 1.71.07 
AXCXD 0.0 .5 | 1.0 Bt 3.8| .1| 1.4] 2.1] 2.0) .1| 4.8 .5| 1.5].149 
BXCXD 7 .6 | 1.3 a +5 | 2.5 +3] 1.7] 4.1) .8| .2/ 3.9) .6).84 
AXBXOCXD E 8] .4 7 1.8] .9 5] 1.8 .1| .6|2.9| .5| .5.92 


Note.—For univariate Fs: df = 1/144; F = 3.91, p = .05; F = 6.82,
p = .01; F = 8.15, p = .005. Abbreviations: AUD = auditory
elaboration; VIS = visual elaboration; NON = no elaboration.

were computed for each elaboration condition separately and are
presented in Table 4. The differences were all in the direction
predicted by Hypothesis 2 that the correlations would be higher for
the children in the control condition, although only one of the three
differences, the correlation of auditory-elaboration items with
no-elaboration items, reached significance (t = 2.16, df = 78,
p < .05).

TABLE 4
Correlations of Item Types in Mixed-List Learning Sessions

Item type              | Auditory elaboration | Visual elaboration | No elaboration

Elaboration subjects
Auditory elaboration   | —                    |                    |
Visual elaboration     | .695                 | —                  |
No elaboration         | .627                 | .151               | —

Control subjects
Auditory elaboration   | —                    |                    |
Visual elaboration     | .737                 | —                  |
No elaboration         | .795                 | .791               | —


Pure-List Learning Performance 


The number of items correctly recalled in 
the pure-list learning session is plotted by 
subject type and presentation mode in Figure 3. To assess the effects
upon learning scores of grade, subject type (verbalizer or
visualizer), and presentation mode (aural or visual), a 2 × 2 × 2
multivariate analysis of variance was performed. The two trials of
this learning session comprised the two dependent variables. The F
ratios are presented in Table 5. The second graders recalled
significantly more items than the kindergarteners. This finding is
consistent with that of the mixed-list condition, where it was found
that second graders benefited more from elaboration than
kindergarteners. The marginally significant multivariate presentation
mode main effect is an indicator that the same items were learned
better under sentence (aural) elaboration than locational (visual)
elaboration.
Fig. 3. Number of items correct in pure-list learning session by
subject type and elaboration mode.


The significant three-way interaction (Grade × Subject Type ×
Presentation Mode) for Trial 2 only suggests that for the second
graders, the prediction of Hypothesis 1, that visualizers would
benefit relatively more from visual elaboration, was confirmed. A
series of post hoc comparisons was made for each subgroup of
visualizers


TABLE 5 
Summary of Multivariate Analysis of Variance for Pure-List
Learning Session


Source                | Trial 1 F | Trial 2 F | Multivariate F
Grade (A)             | 13.52***  | 13.04**   | 7.69**
Subject type (B)      | 0.00      | .01       | 0.00
Presentation mode (C) | 4.87      | 3.69      | 2.52
AXG 54 1.78 | .88 
BXG 1.09 3.21 Lo 

2. 1. 

AXBXC T n^ 2.09 


Note.—For multivariate Fs, df = 2/31; for univariate Fs, df = 1/32.
and verbalizers who were in the same grade and received the same
presentation mode. The only F ratio approaching significance was for
the comparison involving second graders who received visual
elaboration on Trial 2: the visualizers recalled more items than the
verbalizers (F = 3.72, df = 1/8, p = .09). This finding is consistent
with the interpretation of the above three-way interaction.


Relationship between Abilities and 
Learning Performance 


The prediction of Hypothesis 3, that abil- 
ities scores would be correlated with learn- 
ing performance on corresponding item 
types, was tested for both the mixed- and 
pure-list sessions. 

Mixed-list session. The correlations between Primary Mental Abilities
test scores and number correct on each item type in the mixed-list
session were computed for each elaboration condition and by each grade
separately. These results (see Table 6) indicated the following
trends: (a) the relationship between abilities and paired-associate
learning was greater for kindergarteners than for second graders;
(b) for the kindergarteners a certain degree of verbal ability (and to
a lesser extent, numerical ability) was associated with benefit from
both forms of elaboration, and this carried over to the no-elaboration
items within an elaboration list; (c) for second graders, spatial
ability was associated with benefit from elaboration, particularly
visual; (d) the relevant abilities (verbal for kindergarteners and
spatial for second graders) were more highly correlated with recall of
elaborated items than with recall of nonelaborated items.

The significance of the difference between the correlation observed in
the elaboration condition and that in the control condition was tested
for each ability, item-type combination. The only such difference
reaching significance was for the correlation of verbal meaning with
auditory-elaboration items in kindergarteners. The correlation was
higher in the elaboration than the control condition (t = 2.25,
df = 38, p < .05).
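
A comparison of this kind can be approximated with Fisher's r-to-z transformation for two independent correlations. The sketch below is not necessarily the test the authors computed (they report a t statistic); it uses the correlations of verbal meaning with auditory-elaboration items for kindergarteners as read from Table 6 (.596 under elaboration, .162 under control) and n = 40 per group from the table note. The function names are hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1 = math.atanh(r1)  # Fisher r-to-z transform
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


def two_sided_p(z):
    """Two-sided p value for a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Values as read from Table 6 (kindergarten, verbal meaning with AUD items)
    z = fisher_z_difference(0.596, 40, 0.162, 40)
    print(round(z, 2), round(two_sided_p(z), 3))
```

With these inputs the statistic comes out at roughly 2.25, in line with the reported value, and the two-sided probability falls below .05.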

TABLE 6
Intercorrelations of Primary Mental Abilities and Item Types in
Mixed-List Learning Sessions

                              | Kindergarten            | Grade 2
Ability with item type        | Control | Elaboration   | Control | Elaboration
Verbal with P-A AUD           | .162    | .596**        | -.060   | .106
Verbal with P-A VIS           | .186    | .484**        | .079    | .155
Verbal with P-A NON           | .807*   | .487**        | .090    | .153
Spatial with P-A AUD          | .030    | -.032         | .022    | .234
Spatial with P-A VIS          | .126    | -.088         | .230    | .388*
Spatial with P-A NON          | .121    | -.098         | .073    | .265
Number with P-A AUD           | .134    | .432**        | -.060   | -.014
Number with P-A VIS           | .122    | .365*         | .152    | .006
Number with P-A NON           | .200    | .269          | .132    | .049
Perceptual speed with P-A AUD | .029    | .191          | -.090   | .040
Perceptual speed with P-A VIS | .038    | .166          | -.038   | .089
Perceptual speed with P-A NON | .221    | .084          | -.174   | .026

Note.—Abbreviations: P-A = paired associate; AUD = auditory
elaboration; VIS = visual elaboration; NON = no elaboration.
n = 40 in each group.
*p < .05.
**p < .01.

Pure-list session. One further multivariate analysis of variance was
conducted for those kindergarteners and second graders (separately, as
the Primary Mental Abilities test scores are not comparable for the
two grades) who participated in the pure-list session. Subject type
and sex were the independent variables. None of the multivariate Fs
were significant. The only significant univariate effect, subject type
for kindergarteners (F = 4.81, df = 1/16, p < .05), was such that
kindergarten verbalizers performed better on the Verbal Meaning test
than did kindergarten visualizers. This finding supported that portion
of Hypothesis 3 which predicted that verbal ability would be
associated with benefit from auditory elaboration.
DISCUSSION


This study was concerned with investigating the degree to which
characteristic modes of information processing are present in young
children and the relationship of these modes to abilities. A further
purpose was to investigate the relative efficacy of visual and verbal
modes of information processing through the technique of administering
different types of elaboration in
paired-associate learning: auditory, visual, 
or no elaboration at all. Hypotheses about 
strategies in information processing con- 
cerned: (a) the stability of the classification 
of verbalizers and visualizers as identified 
on a mixed list and tested on a pure list; 
(b) the intercorrelations of different item 
types as found under no elaboration; (c) 
the relationship of specific Primary Mental 
Abilities test abilities (e.g., verbal meaning, 
or spatial relations) to corresponding item 
types (e.g., auditory or visual elaboration), 
and (d) possible relationships involving 
perceptual speed that are similar to those 
involving spatial relations. 

Hypotheses about age differences stated 
that: (e) second graders would benefit rela- 
tively more from visual elaboration than 
kindergarteners, and (f) second graders
would recall more items of all types 
whether elaborated or not. 


Strategies in Information Processing 


The learning performance in the pure-list session by the verbalizers
and visualizers identified in the mixed-list session suggested that
some children are relatively more attuned to auditory elaboration and
others to visual elaboration, and that this is a relatively stable
phenomenon. The observed effect, of course, was not overly large when
compared to the effects of presentation conditions; for example,
presence or absence of elaboration. Indeed, the only place in the
pure-list session where children of one process type recalled
significantly more than those of the other type under a given
presentation condition was for second graders under visual elaboration
on Trial 2: visualizers recalled more than verbalizers. On the other
hand, with five subjects per cell, even one statistically significant
difference is noteworthy. To be sure, all other differences were in
the predicted direction.

Another bit of convincing evidence for 
the presence of individual differences in 
modes of information processing was found 
in the mixed-list condition. The item-type 
intercorrelations were higher under condi- 
tions of no elaboration than under condi- 
tions of elaboration, significantly so for the 
correlation of auditory-elaboration with 
no-elaboration items. Thus, the elaboration 
condition served to differentiate the chil- 
dren in terms of relative level of perform- 
ance on each item type. 

The highly significant elaboration effect is, of course, consistent
with numerous findings by Rohwer and his associates that the
presentation of elaborated material facilitates learning. The fact
that second graders in the elaboration condition recalled more items
with no elaboration than those in the no-elaboration condition,
however, is evidence that the context of the items is of extreme
importance. Thus, an estimate of the difficulty of a given item type
based on only data from a mixed list (e.g., Rohwer, 1968) may be
misleading. There is some evidence that very young children can form
linking sentences (Bean & Rohwer, 1969) and conjure up images (Paivio,
1970) when instructed to do so, and that this is facilitative of
learning. It should be considered that in the context of “easier”
(elaborated) items, more attention could be given to the
no-elaboration items, particularly on the second trial. On the other
hand, if the child receives some feedback as to which items are the
most profitable to give attention to, an argument that the child
attends to the elaborated items becomes just as convincing. In view of
the above, it is suggested that second graders in the present study
who received elaboration, that is, learned lists including pairs
linked by a verb or depicted in visual interaction, may have developed
strategies on no-elaboration items of forming their own sentences
and/or imagining the items in a particular spatial arrangement. Since
the superiority of performance on no-elaboration items was primarily
confined to the second exposure, it may take one trial to establish
the effects of elaboration upon strategies.


Age Differences 

It was shown in the present study that 
the second graders benefited more from 
visual elaboration than kindergarteners and 
that this difference was associated with in- 
creasing spatial ability. This finding pro- 
vides further confirmation of Rohwer’s 
(1970a) hypothesis that the dominance of 
visual over verbal modes of information 
processing increases with age. The finding 
that the second graders also benefited sig- 
nificantly more from auditory elaboration 
was somewhat unexpected. The greater gen- 
eral benefit from elaboration by the second 
graders coupled with the absence of grade 
differences in the control condition appears 
to be at odds with some of Rohwer’s earlier 
findings. Rohwer (1967) found for kinder- 
garteners, first graders, and third graders, 
that the older children recalled more items 
under all presentation conditions but bene- 
fited no more than younger children from 
elaboration. More recently, however, Roh- 
wer and his associates (e.g., Bean & Roh- 
wer, 1969) have also obtained data which is 
consonant with the present study. The older 
children appear to benefit more than the 
younger children from verbal as well as vis- 
ual elaboration. It is interesting to note 
that the Rohwer (1967) study which found 
no grade differences in benefit from elabora- 
tion used upper-middle-class white children 
and lower-class black children as subjects; 
while the Bean and Rohwer (1969) study 
and the present study which did find such 
grade differences used primarily lower- to 
lower-middle-class white children. 

It is therefore concluded that during the
primary years the ability to benefit from 
presented visual elaboration increases with 
grade level and there is some evidence that 
the ability to benefit from presented verbal 
elaboration also increases with grade, al- 
though the latter type could depend upon 
social class. 

Relationship of Item Types to Abilities and 
Intellectual Development 

The relationships between scores on the Primary Mental Abilities test
and elaborated paired-associate learning are not entirely clear.
Recall that the results indicated
that verbal meaning was highly correlated 
with auditory-elaboration items for kinder- 
garteners and spatial relations with visual- 
elaboration items for second graders. Char- 
acterization of the developmental process 
will be attempted in the following oversim- 
plified manner: Let us say that there is a 
certain plateau of verbal ability which is 
necessary for the efficient learning of elabo- 
rated items (verbal or visual) and that there 
is a certain plateau of spatial ability for vis- 
ually elaborated items. Kindergarteners are 
in a transitional stage (some have reached
the plateau and some have not) with respect 
to verbal ability, and second graders are in 
a transitional stage with respect to spatial 
ability. Presumably all or nearly all of the 
second graders have attained the prerequi- 
site verbal ability and thus, spatial ability 
is the primary correlate of elaborated learn- 
ing for second graders. 

The high correlation of number facility 
with performance on elaborated items for 
kindergarteners is also of interest. The 
Number Facility subtest, which was also 
highly correlated with verbal meaning for 
kindergarteners, is a test which seems to 
require verbal reasoning on the part of the 
child. It has been contended that the age 
range from 5 to 7 is one where marked 
changes take place in the amount and type 
of thinking in which the child engages (e.g., 
Bruner, Olver, Greenfield, et al., 1966). Pos- 
sibly, then, similar verbal reasoning proc- 
esses are involved for the kindergarteners in 
performing on the Verbal Meaning and 
Number Facility subtests, and in benefit- 
ing from elaborated presentation of learn- 
ing materials. 


Sex and Sequence Differences 


The presence of consistently superior 
learning performance by the boys is not 
readily explained. While sex differences in 
children’s paired-associate learning do not 
appear to have been previously reported, 
Rohwer (personal communication, 1970) 
has indicated that a recent study (Rohwer 
et al., 1971) showed a tendency for greater 
recall by boys than girls. In the present 
study, the fact that the experimenter was male could partially account
for this finding.

The significant sequence effect is extremely curious, especially when
it is considered that the effect was nearly identical for both sexes
and both elaboration conditions, although it was more prevalent for
second graders than for kindergarteners. List 1 was learned better
only when it appeared second; that is, after List 2 had been
presented. No convincing reason for the effect presents itself.

It was concluded that this study provided some evidence that
“processing types” or stable characteristic modes of information
processing do exist in young children. Such a finding was consistent
with the line of research which has found that learning performance is
dependent not only upon the characteristics (e.g., abilities) of the
learner and the characteristics of the task, but upon the interaction
of the two. It has also been suggested by Frederiksen (1969) that an
important mediator between the characteristics of the learner and
those of the task are “cognitive strategies” or characteristic modes
of information processing. The findings of the present study, that
this is apparently also true for young children, are particularly
striking in view of the fact that the abilities of young children have
been found to be much less differentiated than those of adults.

In addition to a classification such as verbalizers and visualizers,
there are likely many more untapped dimensions within individual
children that are important determinants of how the individual will go
about learning, which materials are best suited to the individual, and
how the individual should be instructed. A line of investigation
attempting to discover such dimensions has obvious utility for
educational practice as well as for psychological theory.


REFERENCES | 


Anastasi, A. On the formation of psychological traits. Paper presented
at the first annual Robert Choate Tryon Memorial Lecture, University
of California, Berkeley, 1970.

Bean, J. P., & Rohwer, W. D., Jr. A developmental study of
facilitation and interference in children's paired-associate learning.
Paper presented at the meeting of the American Educational Research
Association, Los Angeles, March 1969.

Bock, R. D. Multivariate analysis of repeated measurements. In C. W.
Harris (Ed.), Problems in measuring change. Madison: University of
Wisconsin Press, 1963.

Bruner, J. S., Olver, R. R., Greenfield, P. M., et al. Studies in
cognitive growth. New York: Wiley, 1966.

Davidson, R. E., & Adams, J. F. Verbal and imagery processes in
children's paired-associate learning. Journal of Experimental Child
Psychology, 1970, 9, 429-435.

Dilley, M. G., & Paivio, A. Pictures and words as stimulus and
response items in paired-associate learning of young children. Journal
of Experimental Child Psychology, 1968, 6, 231-240.

Ehri, L. C., & Rohwer, W. D., Jr. Verb facilitation of
paired-associate learning as a function of syntactic and semantic
relations. Journal of Verbal Learning and Verbal Behavior, 1969, 8,
773-781.

Flavell, J. H., Beach, D. R., & Chinsky, J. M. Spontaneous verbal
rehearsal in a memory task as a function of age. Child Development,
1966, 37, 283-289.

Fleishman, E. A., & Bartlett, C. J. Human abilities. Annual Review of
Psychology, 1969, 20, 349-380.

Frederiksen, C. H. Abilities, transfer and information retrieval in
verbal learning. Multivariate Behavioral Research Monographs, 1969
(Whole No. 69-2).

Frederiksen, C. H. Functional indeterminacy and cognitive processes in
learning performance. Paper presented at the Western Psychological
Association, Los Angeles, April 1970.

French, J. W. The relationship of problem-solving styles to the factor
composition of tests. Educational and Psychological Measurement, 1965,
9-28.

Gagné, R. M. (Ed.), Learning and individual differences. Columbus,
Ohio: Merrill, 1967.

[Surname illegible], R. D. Parameters in the choice of cognitive
strategies. Unpublished doctoral dissertation, University of
California, Berkeley, 1970.

[Surname illegible], R. B. SES differences on learning and ability
tests in black children. Unpublished master's thesis, University of
California, Berkeley, 1969.

Guilford, J. P. The nature of human intelligence. New York:
McGraw-Hill, 1967.

Kagan, J., Moss, H. A., & Sigel, I. E. Psychological significance of
styles of conceptualization. In J. C. Wright & J. Kagan (Eds.), Basic
cognitive processes in children. Monographs of the Society for
Research in Child Development, 1963, 28 (No. [illegible]).

Kee, D. W., & Rohwer, W. D., Jr. Paired-associate learning efficiency
as a function of response [illegible] and elaboration. Paper presented
at the annual meeting of the American Educational Research
Association, Minneapolis, March 1970.

Neisser, U. Cognitive psychology. New York: Appleton-Century-Crofts,
1967.

Paivio, A. On the functional significance of imagery. Psychological
Bulletin, 1970, 73, 385-392.

Rohwer, W. D., Jr. Social class differences in the role of linguistic
structures in paired-associate learning: Elaboration and learning
proficiency. (Final Rep. on USOE Basic Research Proj. No. 5-0605,
Contract No. OE 6-10-273) Washington, D. C.: United States Office of
Education, 1967.

Rohwer, W. D., Jr. Socioeconomic status, intelligence and learning
proficiency in children. Paper presented at the meeting of the
American Psychological Association, San Francisco, September 1968.

Rohwer, W. D., Jr. Images and pictures in children's learning.
Psychological Bulletin, 1970, 73, 393-403. (a)

Rohwer, W. D., Jr. Mental elaboration and proficient learning. In J.
P. Hill (Ed.), Minnesota symposia on child psychology (Vol. 4).
Minneapolis: University of Minnesota Press, 1970. (b)

Rohwer, W. D., Jr., Ammon, M. S., Suzuki, N., & Levin, J. R.
Population differences and learning proficiency. Journal of
Educational Psychology, 1971, 62, 1-14.

Rohwer, W. D., Jr., Lynch, S., Levin, J. R., & Suzuki, N. Pictorial
and verbal factors in the efficient learning of paired-associates.
Journal of Educational Psychology, 1967, 58, 278-284.

Rohwer, W. D., Jr., Lynch, S., Suzuki, N., & Levin, J. R. Verbal and
pictorial facilitation of paired-associate learning. Journal of
Experimental Child Psychology, 1967, 5, 294-302.

Suzuki, N. Noun-pair learning in children and adults: Deep structure
and retrieval time. Unpublished doctoral dissertation, University of
California, Berkeley, 1969.

Thurstone, L. L. The differential growth of mental abilities. Chapel
Hill, N. C.: University of North Carolina, Psychometric Laboratory,
1955.

Thurstone, L. L., & Thurstone, T. G. SRA Primary Mental Abilities
Technical Supplement. Chicago: Science Research Associates, 1954.

Tulving, E. Theoretical issues in free recall. In T. R. Dixon & D. L.
Horton (Eds.), Verbal behavior and general behavior theory. Englewood
Cliffs, N. J.: Prentice-Hall, 1968.

White, S. H. Evidence for a hierarchical arrangement of learning
processes. In L. P. Lipsitt & C. C. Spiker (Eds.), Advances in child
development and behavior (Vol. 2). New York: Academic Press, 1965.

Witkin, H. A., Dyk, R. B., Faterson, H. F., Goodenough, D. R., & Karp,
S. A. Psychological differentiation: Studies of mental development.
New York: Wiley, 1962.


(Received December 18, 1970) 


Journal of Educational Psychology 
1972, Vol. 63, No. 3, 218-224 


NEGRO CHILDREN’S USE OF NONSTANDARD GRAMMAR¹


SAMUEL J. MARWIT*, KAREN L. MARWIT*, AND JOHN J. BOSWELL 


Two Negro and two white examiners presented 93 Negro and 108
white second graders with a task requiring them to derive the present,
plural, possessive, and time extension forms of nonsense syllables.
The hypothesis that white subjects would supply more standard English
forms and Negro subjects more nonstandard English forms was
supported. The hypothesized characteristics of nonstandard English
were upheld in all but one category. The possibility of Negro
nonstandard English being a distinct “quasi-foreign” language system
and its implications were discussed.


It has been noted for a long time (Kline- 
berg, 1935; Pasamanick & Knobloch, 1955) 
that Negro children, primarily those of 
lower socioeconomie status, appear deficient 
in language functioning. Many of these lin- 
guistic “deficiencies” are similar to those 
noted among white children of low socioeco- 
nomic status (Bernstein, 1961; Templin, 
1957) ; others appear specifically related to 
race (Deutsch, 1965). Most of the literature 
to date has been focused either on the rela- 
tionship of these deficits to specific cogni- 
tive impairments (Deutsch, 1965; John, 
1963; John & Goldstein, 1964; Klineberg, 
1935) or to those social conditions that 
might be responsible for the manifestation 
of such problems (Gray & Klaus, 1963; 
McCarthy, 1961; Milner, 1951; Nisbet, 
1961). Regardless, the traditional view of 
Negro children’s language is that it repre-
sents a “substandard” language relative to
white middle-class norms and expectations
(S. Baratz, 1968).

Recently, however, some linguists and ed-
ucators (Bailey, 1968; J. Baratz, 1969; S.
Baratz, 1968; Labov, 1967; Stewart, 1967,
1968; Vetter, 1969) have come to regard 
"black language" as a uniquely different 
linguistic system from that of standard 
American English. Instead of considering it 
substandard American English, they have 
come to view it as nonstandard American 
English. They point out that black lan- 
guage follows a consistent and predictable 
set of phonological and grammatical rules 
that are highly elaborated and sophisti- 
cated, and different from those governing 
the standard English used by most white 
Americans. If this is the case, Negro chil- 
dren are approaching the traditional school 
situation with the overwhelming disadvan-
tage of speaking a “quasi-foreign language”
(Stewart, 1968) which is neither fully rec-
ognized nor openly accepted. The problems
this poses in holding one’s own in reading,
writing, communication, and concept for-
mation have been clearly illustrated by
Bailey (1968) and Vetter (1969). These
problems, according to Deutsch (1965), are
“cumulative” and therefore increase over
the child's academic career.

Using standard English as a base
point, the major distinguishing syntactical
features of Negro nonstandard English are
(a) the zero copula (absence of the verb
“is” in the present tense); (b) singulariza-
tion of plural objects; (c) the zero posses-
sive (lack of a morphological possessive);
and (d) the use of “be” to represent time 
extension. Examples of each of these, re- 
spectively, are, “He go,” “There are two 
hat,” “The man hat,” and “He be going.” 
Unfortunately, most of the literature per- 
taining to these nonstandard patterns has 
been descriptive and observational. Few at- 
tempts, if any, have been made to study 
them empirically. If Negro nonstandard 
English is, in fact, a well-ordered, highly 
structured, highly developed language sys- 
tem, we must assume, as does J. Baratz 
(1969), that by the time the Negro child is 
5, he has learned the rules of his linguistic 
environment. The present study investi- 
gates this by employing a design similar to 
that used by Berko (1958) in studying 
white children’s acquisition of the rules of 
standard English. Negro and white second 
graders were required to transform nonsense 
syllables in ways designed to represent each
of the above four distinguishing grammati-
cal features. Nonsense syllables were used
to insure that the child was responding in
terms of internalized rules and not in terms
of familiarity with preexisting vocabulary.
It was hypothesized that for each category,
white children would supply significantly
more standard English forms and Negro
children significantly more nonstandard
English forms of the variety described
above.


METHOD
Subjects


A total of 229 second graders from 10 classrooms
in elementary schools in a St. Louis County
school system were tested by two Negro and
two white examiners. Nineteen of these subjects
were discarded because of an examiner’s failure to
follow standard instructions, 8 because informa-
tion relevant to socioeconomic status could not be
obtained, and 1 because of oriental origin. The re-
maining sample consisted of 93 Negro subjects, 38
of whom were tested by Negro examiners, 55 by
white examiners; and 108 white subjects, 49 tested
by Negro examiners, 59 by white examiners. Sub-
jects were subdivided into high, middle, and low
socioeconomic status by applying Hol-
lingshead’s (1958) Occupational scale from his Two
Factor Index of Social Position to subjects’ par-
ental occupation. High, middle, and low were arbi-
trarily represented by Categories 1-3, 4, and 5-7,
respectively. Unfortunately, parental occupation
was the only demographic datum provided by the 
schools. The absence of supportive educational 
and/or income information necessarily reduces the 
validity of the scale (Light & Smith, 1969) and any 
effects due to socioeconomic status must be inter- 
preted with this in mind. 


Apparatus 


Each subject was administered a test consisting 
of 24 ambiguous drawings each accompanied by 
sentences read by the examiner describing the 
drawing as either an object or a person engaged in 
some action. In all cases, the object or action was 
labeled by a nonsense syllable and presented to the 
subject in such a way that he was required to derive 
the present, plural, possessive, or time extension 
form of the nonsense syllable. The first four items 
were sample items offered to (a) familiarize the 
subject with the task and (b) to assure the exam- 
iner that his subject understood and was able to 
perform it. These were followed by 20 test items 
arranged sequentially such that each of the 5 
items assessing present tense was followed by one 
testing the formation of plural objects, followed 
by possessive, followed by time extension. This 
order was chosen to minimize the generalization of 
a set established on one item to any of the four 
related items. Examples of each test item and the 
order of presentation are given below: 


(1) Present tense. Stick figure reclining with
legs crossed and head on hand. “This is a
man who knows how to pid /pɪd/. What is
he doing now? Now he ____.”

(2) Pluralization. One, then two figures re-
sembling musical notes. “This is a lun
/lʌn/. Now there is another one. There are
two of them. There are two ____.”

(3) Possessive. Cup, lun holding cup. “This is a
cup that belongs to the lun. Whose cup is it?
It is the ____.”

(4) Time extension. Stick figure positioned for
throwing. “This is a man who knows how
to mork /mɔrk/. He does this all the time.
All the time, he ____.”


The 18 nonsense syllables employed for the total 
24 items (12 used only once as in 1 and 4 above, 6 
duplicated as in 2 and 3 above) were selected 
from a total of 25 nonsense syllables on the basis 
of association values obtained from an independ- 
ent sample of 60 Negro and 22 white second graders 
from a school system other than the one under 
study (J. J. Boswell, K. L. Marwit, & S. J. Marwit, 
unpublished data, 1970). Those 15 syllables which
had the lowest association value and the highest
frequency of independent responses were em-
ployed as test stimuli. The next three highest in
“nonsensibility” were employed as sample items.
The remaining six were discarded from study.
All sessions were recorded on Ampex 641-1/4-
1800 tape using a Wollensak 1500 tape recorder at
3¾ inches per second.


Testing Procedure


Prior to testing, examiners attended four 2-hour 
training sessions, half of each being devoted to the 
practice testing of children (four per examiner) 
from schools other than those used in the study, 
and half devoted to a discussion of problems in test 
administration and to practice in the verbatim re- 
cording of subjects’ responses. Examiners were told 
that they were participating in a study of language 
development but were never informed of the hy- 
potheses being tested. They were instructed to ac- 
cept all subject responses as being “inherently cor- 
rect for that particular child at his particular stage 
of linguistic development.” Posttest interviews con- 
firmed each examiner’s ignorance of the purpose 
of the research. 

Testing for data collection was performed in 
rooms set aside by each school for the express pur- 
pose of conducting this study. Each subject was 
tested individually. Each was seated at a table op-
posite the examiner and told that he was “about
to play a little word game using a tape recorder” 
and that he was to speak directly into the micro- 
phone placed before him. The task was then intro- 
duced to the child as follows: 

We are going to play a silly word game with
a bunch of silly words that somebody made up.
I think you will find this a lot of fun. What I
am going to do is this. I am going to say some
sentences but I will leave off the last part.
What you are to do is finish the last part for
me. OK? (Answer any questions that might
arise.) Now let’s practice.

The examiner then administered the four sam-
ple items which could be repeated for the child, if
necessary. Examiners were not permitted, however,
to repeat anything more than the sentence stem.
Most children comprehended the task by Item 2,
all by Item 3. Practice was followed by the ex-
aminer’s presentation of the 20 test items, for
which no repetition was permitted. Each subject’s
responses were recorded verbatim in a test booklet
which also provided space for his name, sex, age,
race, and “comments.”


Rating Procedure 


All tapes of all sessions were given to two 
student speech clinicians who independently 
recorded all subjects’ responses verbatim in 
test booklets identical to those used by ex-
aminers. It was felt that “trained ears”
whose sole task was to listen and record
would provide an accurate assessment of
each subject’s responses as well as a relia-
bility check on the examiners’ ability to re-
cord these responses. Responses recorded by
examiners and speech clinicians were then
rated by the three principal investigators
blind to subjects’ identifying information.


TABLE 1
RATING SCALE AND EXAMPLES OF STANDARD ENGLISH, NONSTANDARD ENGLISH AS HYPOTHESIZED, AND
NONSTANDARD ENGLISH OTHER THAN HYPOTHESIZED

Task and category               Rating   Example
Present tense                            Now he ____
  SE                              1      is pidding, pids
  NSE as hypothesized             2      pid
  NSE other than hypothesized     3      is pid
  NSE other than hypothesized     4      pidding
  No response                     5
Pluralization                            There are two ____
  SE                              1      luns
  NSE as hypothesized             2      lun
  No response                     5
Possessive                               It is the ____
  SE                              1      lun’s
  NSE as hypothesized             2      lun
  No response                     5
Time extension                           All the time, he ____
  SE                              1      is morking, morks
  NSE as hypothesized             2      be morking
  NSE other than hypothesized     3      mork
  NSE other than hypothesized     4      morking
  No response                     5

Note. Abbreviations: SE = standard English, NSE = nonstandard English.


The rating scale provided categories for
standard English, nonstandard English as 
hypothesized, and nonstandard English 
other than hypothesized. This scale and ex- 
amples utilizing sentence stems from the 
sample items above are included in Table 1. 

A kappa coefficient (κ) of agreement for
nominal scale data (Fleiss, Cohen, & Ever-
itt, 1969) was used to test interrecorder re-
liability. All κs were highly significant (p
< .001). Tests of differences between κs
were nonsignificant. While the vast major-
ity of test items showed triple agreement, 
those that did not showed at least double 
agreement. Thus, the "best two out of three" 
was defined as the criterion for obtaining 
scores for the final data analyses. 
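
The agreement check and the "best two out of three" rule can be illustrated with a minimal Python sketch. It assumes an unweighted kappa computed for one pair of recorders at a time; the function names and the toy ratings are hypothetical, and the significance tests of Fleiss, Cohen, and Everitt (1969) are not reproduced. It is an illustration only, not the procedure used in the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted kappa for two recorders who rated the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of items on which the two recorders agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each recorder's category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def best_two_of_three(r1, r2, r3):
    """Rating shared by at least two of three recorders, or None if all differ."""
    rating, count = Counter([r1, r2, r3]).most_common(1)[0]
    return rating if count >= 2 else None


# Hypothetical ratings (categories 1-5 as in Table 1) for four items.
examiner = [1, 1, 2, 2]
clinician = [1, 1, 2, 1]
kappa = cohen_kappa(examiner, clinician)
final_score = best_two_of_three(1, 2, 1)
```

Because the rule returns None when all three recorders disagree, an item without at least double agreement would be flagged rather than silently scored.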


RESULTS 


Individual Differences between Examiners 


It was decided, a priori, to initially test 
for differences between examiners. Univar- 
iate analyses of variance comparing all ex- 
aminers for each rating category for each of 
the four tasks revealed only one significant 
main effect. That was for the number of 
standard English forms elicited on the pres-
ent tense task (F = 2.86, df = 3/197, p <
.05). A Duncan multiple-range test (Winer,
1962) showed this to be the result of differ-
ences between Negro and white examiners
and not between examiners of the same
race. On this basis, both Negro examiners’
scores were combined as were both white
examiners’ scores. All ensuing analyses of
variance, therefore, employed two levels of
examiner race in addition to the two levels
of subject race and three levels of subject
socioeconomic status.


Standard English 


The multivariate analysis of variance
used to test the hypothesis that white sub-
jects supply significantly more standard
English forms than Negro subjects on all
four tasks is presented in Table 2. The
mean number of standard English endings
obtained by both races on all tasks can be
seen from Table 3. While all four mean
comparisons are in the hypothesized direc-
tion and a strongly significant effect of sub-
ject race was obtained, a Subject Race X


TABLE 2
ANALYSIS OF VARIANCE OF NUMBER OF STANDARD
AMERICAN ENGLISH FORMS SUPPLIED BY
NEGRO AND WHITE SUBJECTS ON
FOUR TASKS

Source                                  df      MS       F
Between
  Examiner race (A)                      1     6.22     .55
  Subject race (B)                       1   140.07   12.45**
  Subject socioeconomic status (C)       2    34.96    3.11
  A X B                                  1     1.48     .18
  A X C                                  2     3.40     .30
  B X C                                  2    12.08    1.07
  A X B X C                              2     3.4      .81
  Error                                189    11.25
Within
  Task (D)                               3    62.84   64.58***
  A X D                                  3     1.76    1.80
  B X D                                  3     5.17    5.31**
  C X D                                  6     1.37    1.41
  A X B X D                              3     2.73    2.80*
  A X C X D                              6      .73     .15
  B X C X D                              6     2.97    3.05*
  A X B X C X D                          6      .59     .60
  Error                                878      .97

* p < .05.
** p < .01.
*** p < .001.


Task interaction was also obtained indicat- 
ing significant race effects on certain tasks 
only. Univariate analyses of variance ana- 
lyzing each task separately indicate signifi- 
cant effects of subject race on the plural (F 
= 7.78, df = 1/189, p < .005), possessive 
(F = 12.11, df = 1/189, p < .0001), and
time extension (F = 20.03, df = 1/189, p <
.0001) dimensions but not on the present
tense task.

Two significant triple interactions were 
obtained. Observation of the relevant 
means indicates that the Examiner Race X 
Subject Race X Task interaction is the re- 
sult of Negro examiners eliciting more 
standard English from Negro subjects on 
all but the time extension task and from 
white subjects on all but the plural task. 
Whether this is primarily an examiner ef- 
fect with Negro examiners facilitating or 
white examiners suppressing the occurrence 
of standard English regardless of subject 
race, or an interactive effect dependent 
upon particular examiner-subject combi-
nations cannot be ascertained from the


TABLE 3
MEAN NUMBER OF STANDARD AMERICAN ENGLISH AND NONSTANDARD AMERICAN ENGLISH AS
HYPOTHESIZED FORMS SUPPLIED BY NEGRO AND WHITE SUBJECTS ON PRESENT, PLURAL,
POSSESSIVE, AND TIME EXTENSION TASKS

                                                 Task
Form          Race     n     Present       Plural        Possessive    Time extension
                             M     SD      M     SD      M     SD      M      SD
Standard      White   108   2.65  1.94    3.91  1.81    3.93  1.75    4.00   1.78
              Negro    93   1.80  1.80    2.89  2.01    2.73  1.93    2.46   2.11
Nonstandard   White   108   1.15  1.83    1.06  1.78    1.04  1.71     .90*  1.71
              Negro    93   1.75  1.92    2.04  1.99    2.25  1.95    2.43*  2.14

* No nonstandard English as hypothesized, rated 2, was obtained for time extension. Scores entered
represent the mean number of 3 ratings obtained.


present design, nor can the reason for the 
reversal of these effects in one of four cases. 
The Subject Race X Subject Socioeconomic
Status X Task interaction was analyzed by 
applying Scheffé’s (1953) test to all pairs of 
mean differences in the amount of standard 
English endings supplied by each race on 
each task at each socioeconomic level (k = 
24). In all comparisons, white subjects sup- 
plied more standard English than Negro 
subjects. Neither Negro and white subjects 
of high socioeconomic status nor Negro and 
white subjects of middle socioeconomic sta- 
tus differed in their relative rates of supply- 
ing standard English endings to each of the 
four tasks. In other words, the functions de- 
picting both races’ performances across the 
four tasks at these socioeconomic levels 
were essentially parallel. On the other hand, 
a significant difference was obtained when 
comparing Negro and white subjects of low 
socioeconomic status in their relative rates 
of responding to the present and time exten- 
sion tasks as vs. the plural and possessive 
tasks (F = 53.27, F′.01 = 43.31). Plotting
the means for these groups across tasks in- 
dicates nonparallel functions and suggests 
that the major contributing factor in the 
triple interaction is the differential rate of 
responding on the time extension task. 
Whether white subjects are overproducing 
or Negro subjects underproducing standard 
English forms on this task relative to their 
performance on the other three tasks cannot 
be determined, nor can the reason for this 
discrepancy occurring among subjects of 
one socioeconomic level only. 




Nonstandard English 


Inherent in the white subjects’ signifi-
cantly greater productivity of standard
English is the implication that Negro sub-
jects respond significantly more with either
one or a number of nonstandard English
forms. To determine whether these are of
the variety hypothesized, a multivariate
analysis of variance, similar in structure to
that found in Table 2, was run for the total
number of hypothesized nonstandard Eng-
lish forms obtained from subjects of both
races on the present, plural, and possessive
tasks. Time extension was omitted from
analysis because no nonstandard English as
hypothesized was obtained. In other words,
no subject responded to the sentence stem
“All the time, he ____” by supplying
“be” followed by the gerund form of the
nonsense syllable.

As can be seen in Table 3, all means for
the three comparisons are in the predicted
direction. The analysis of variance dis-
played a significant effect of subject race
(F = 8.80, df = 1/189, p < .01) and a
significant Subject Race X Task interaction
(F = 3.18, df = 2/378, p < .05). This
complements results obtained in the analy-
sis of standard English forms described
above. Univariate analyses of variance ana-
lyzing each task separately again showed
significant effects of subject race for the
plural (F = 8.10, df = 1/189, p < .005)
and possessive (F = 12.92, df = 1/189, p <
.0005) tasks and again failed to reach sig-
nificance for the present tense variable. Re-
garding time extension, while no hypothe-
sized nonstandard English forms were ob-
tained, Negro subjects did consistently offer
a nonhypothesized nonstandard English
form. Significantly more Negro than white
subjects responded to the time extension
stem by supplying the stimulus syllable,
without modification (F = 20.07, df =
1/189, p < .0001) thereby obtaining a rating
of 3 (see Table 1).


DISCUSSION 


In general, the results support the hy- 
pothesis that white children supply more 
standard English endings to nonsense sylla- 
bles designed to represent the plural and 
possessive of nouns and the present and 
time extension forms of verbs, and that 
Negro subjects, consequently, supply more 
nonstandard English forms. Significant syn- 
tactical differences due to subject race were 
obtained on all but the present tense task. 
The hypothesized characteristics of Negro 
nonstandard English were supported for all 
but the time extension dimension. 

The failure to obtain significant subject
race differences on the present tense task
was a particularly unexpected finding.
While it is possible that, in actuality, no
differences exist, it is unlikely since it is this
category, more than any other, that is re-
ferred to in the literature when document-
ing racial differences in language functions.
A second possibility is that differences do
exist but that the grammatical rules in-
volved are particularly difficult to learn and
are not incorporated by the time children
reach second grade. However, this too is un-
likely since it is hard to see what is more
difficult about learning these rules than
those governing the other three tasks for
which significant differences were obtained.
More likely, the failure to obtain signifi-
cance resulted from the investigators’ poor
choice of nonsense stimuli for this task.
Of the five words used to test present tense,
one was ris, another zub. To the first, sub-
jects could respond with ris which would be
rated nonstandard English as hypothesized
or with a form rated standard English. The final s
of the stimulus syllable makes the auditory
discrimination of these forms difficult, espe-
cially if the response is slurred or spoken
rapidly. Similarly, with zub, subjects could
respond with either, “Now he’s zubbing,” a
standard English form, or “Now he zub-
bing,” a nonstandard English form. Re-
search now in progress has substituted more
easily discriminable stimuli; that is, non-
sense syllables not containing sibilants in
the initial or final position, and should help
determine whether or not the hypothesized
differences exist.

Regarding the preponderance of non- 
standard English given by Negro subjects, 
the hypothesized form was given to a sig- 
nificant degree on the plural and possessive 
tasks. A noteworthy but nonsignificant 
trend in this direction was also obtained for 
the present tense task. The failure to reach 
significance in this latter case is probably 
the complementary result of the poor choice 
of present tense nonsense syllables discussed 
above. The complete failure of the time ex-
tension task to elicit any nonstandard Eng-
lish as hypothesized was surprising. Either
the hypothesized form was incorrect or the 
sentence stem was improperly structured to 
elicit it. According to J. Baratz (1969) and 
others, “be” followed by the "ing" form of 
the verb in and of itself denotes time exten- 
sion for the Negro child. It is therefore pos- 
sible that the authors’ use of the stem “all 
the time” obviated the Negro subjects’ need 
to supply “be-ing.” To do so would have 
simply been redundant and poor grammati- 
cal form in any man’s language. Just what 
stimulus, if any, is required to elicit the 
hypothesized nonstandard form of time ex- 
tension must remain a question for future 
investigation. More important for the pres- 
ent hypothesis, however, is recognition of 
the fact that even though Negro subjects 
failed to respond with the hypothesized 
nonstandard form, they did supply an alter-
nate nonstandard form with significant regularity.

The consistent use of nonstandard Eng-
lish forms by Negro subjects is probably
the most remarkable finding of this study. 
It lends empirical support to those who 
have claimed that “black language" is a 
separate, highly consistent language with 
fixed grammatical rules that differ in par- 
ticular ways from the rules governing the 
language used by most white Americans. If
black language were nothing more than a
substandard form of standard English, a 
sloppy array of nonstandard forms should 
have emerged. Instead, well-defined non- 
standard forms differing in set ways from 
standard English were elicited for the most 
part by each sentence stem. The problems 
inherent in a culture supporting languages
differing in grammar yet sharing the same 
vocabulary are too immense to be elabo- 
rated upon here. Yet, it seems imperative to 
note, in conclusion, that unless the distin- 
guishing features of one language are recog- 
nized and accepted by speakers of the 
other, no one stands to gain.


SOME ASPECTS OF THE CORRESPONDENCE BETWEEN
CONTENT STRUCTURE AND COGNITIVE STRUCTURE
IN PHYSICS INSTRUCTION


R. J. SHAVELSON* 
Stanford University 


Forty subjects were divided at random into instruction (n = 28) and 
control (n = 12) groups to investigate the correspondence between the 
structure of the stimulus material (content structure) and the struc- 
ture of a learner’s memory during the learning (cognitive structure). 
Content structure was analyzed using digraph theory. Key concepts 
and their interrelations were the subject of analysis. All subjects were
given achievement and word association pretests. Then, the instruction 
group read five sections of a text on Newtonian mechanics, one on 
each of 5 days. At the conclusion of each day, a word association test 
was administered. The achievement posttest was administered on the 
last day also. The control group did not receive instruction; these 
subjects received all tests. The structure of the content can be de-
scribed as “tight” and “formal” according to digraph analysis. In
the instruction group: (a) achievement increased significantly from 
pretest to posttest, (b) cognitive structure (word association data)
changed considerably during instruction, (c) key concepts were inter- 
related more closely at the end of instruction than at the beginning, 
and (d) cognitive structure corresponded more closely to content 
structure at the end of instruction than at the beginning. Similar 
changes for the control group were not observed. 


A critical problem in the development of 
curriculum and in the formulation of in-
struction is that of how to structure² a body
of knowledge so that the communication of 
this knowledge to the learner can be effec- 
tive, and his learning correspondingly 
efficient. In the last 10 years, cognitive 
(structural) theories of learning have ex- 
erted considerable influence on attempts to 
solve this problem of structure in instruc-
tion. These theories have suggested struc-
ture for instruction from their psychological
view of learning. Typically, these theories
have postulated a cognitive structure and
have suggested that the structure of in-
struction should make use of the postulated
structure (Ausubel, 1963; Ausubel & Fitz-
gerald, 1961, 1962; Bruner, 1966; Gagné,
1962, 1965).

1 Requests for reprints should be sent to Richard
J. Shavelson, School of Education, Stanford Uni-
versity, Stanford, California 94305. The author
wishes to acknowledge his dissertation committee,
Professors Snow, Cronbach, and Gage, for their

2 Structure has been assigned many meanings in
the literature. In this study structure was de-
fined as an assemblage of identifiable elements and the
relationships between those elements. Structure
may be objective and real or internal and subjective

These events lead to a crucial question, 
“To what extent does the structure in the 
student’s memory, after learning, corre- 
spond to the structure in the instructional 
material?” This question of correspondence 
has been indirectly investigated by cogni-
tive learning theorists (e.g., Ausubel &
Fitzgerald, 1961; Gagné & Paradise, 1962) 
but this research is inconclusive; it has not 
isolated the structure of the instructional 
material from the structure of the student’s 
memory and studied the correspondence. 
The question of correspondence is critical; 
this question represents in broad terms the 
problem examined in this study. 

While a large number of studies have
dealt with content analysis (for a review,
see Berelson, 1954), digraph analysis of
content (Frase, 1969; Kingsley, Kopstein,
& Seidel, 1968; Regnier & de Montmollin, 
1968), and word associations as represent- 
ative of cognitive structure (Deese, 1962, 
1965; Johnson, 1964, 1965, 1967, 1969, 1970; 
Kiss, 1969; Rothkopf & Thurner, 1970), 
few studies have compared an analysis of 
content with an analysis of memory. John- 
son (1967, 1969) found that the frequency 
with which concepts occurred in text was 
directly related to the frequency of asso- 
ciates given to those concepts on a word 
association test. 

The first step was to identify the 14 key 
concepts (MOMENTUM, INERTIA, 
POWER, MASS, TIME, WORK, 
WEIGHT, ACCELERATION, FORCE, 
DISTANCE, VELOCITY, IMPULSE, 
SPEED, and ENERGY). 


As a basis for selecting the stimulus words...a 
frequency count was made of the words which rep- 
resented concepts in Newtonian mechanics in Ss’ 
textbook (Dull, Metcalfe, & Williams, 1960).... 
Fourteen words were selected from this word count 
so as to represent as much of the frequency range 
as possible, under the restriction that the list of 
words include six concepts whose definitions within 
the text consisted of simple physical equations 
[Johnson, 1967, p. 78]. 


Next, every sentence and equation in the 
text containing two or more of the key con- 
cepts was diagrammed using the procedure 
suggested by Warriner and Griffiths (1957);
there were 170 diagrams in all. Then, each 
diagram was converted into a digraph using 
rules reported by Shavelson (1970). For ex-
ample, the sentence, “Force is the product
of mass and acceleration” was diagrammed
as:

[Sentence diagram with the elements FORCE, product, MASS, and ACCELERATION]


Using the conversion rules, the following di- 
graph was obtained (see Figure 1). 

The symmetric relation between FORCE
and product is specified by the rule for link-
ing verbs; a linking verb does not specify
action and is to be digraphed as a symmet-
ric relation between two points. The sym-
metric relation between product and AC-
CELERATION is specified by the rule for
prepositions; if the preposition does not
specify direction, the relation is digraphed
symmetrically. The absence of a line sym-
metrically connecting MASS and ACCEL-
ERATION follows from the rule that
whenever two words or groups of words are
joined by a coordinating conjunction (e.g.,
and), those words are digraphed independ-
ently of each other.

FIG. 1. Digraph obtained for the sentence:
Force is the product of mass and acceleration.

The distance between two points on a di-
graph is the number of lines in the shortest
path connecting the two points. Only those
digraphs representing the shortest distance
between pairs of concepts received further
analysis. One hundred seventy digraphs
were reduced to 52 in this manner. These
digraphs contained key concepts and con-
cepts lying in a path between them. In the
display above, the key concepts of FORCE,
MASS, and ACCELERATION and the con-
cept of product were contained in the di-
graph. To combine all 52 sentence digraphs
into 1 digraph representing content struc-
ture, an adjacency matrix was formed. This
matrix contained the key concepts and the
concepts lying in a path between them. The
entry $a_{ij} = 1$ is made in an adjacency matrix
if a line leads from Point i to Point j in the
digraph; it contains the entry $a_{ij} = 0$ if a
line does not connect Point i with Point j.
Once the adjacency matrix was formed for
all 52 digraphs, this matrix was converted
into a distance matrix (a distance matrix
contains the distances between pairs of
points on the digraph) using procedures
given by Harary, Norman, and Cartwright
(1965, pp. 135-136).
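
The conversion from an adjacency matrix to a distance matrix can be sketched in Python for the single example sentence. This is a minimal illustration that uses breadth-first search in place of the procedures of Harary, Norman, and Cartwright (1965), which are not reproduced here. The function name is hypothetical, and the line between product and MASS is an assumption made for the example; the text explicitly describes only the FORCE-product and product-ACCELERATION relations and the absence of a MASS-ACCELERATION line.

```python
from collections import deque


def distance_matrix(adjacency):
    """Shortest-path distances between all pairs of points of a digraph.

    adjacency[i][j] == 1 means a line leads from point i to point j.
    Unreachable pairs are left as None.
    """
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if adjacency[i][j] == 1 and dist[start][j] is None:
                    dist[start][j] = dist[start][i] + 1
                    queue.append(j)
    return dist


points = ["FORCE", "product", "MASS", "ACCELERATION"]
# Symmetric relations; the product-MASS line is assumed for illustration.
symmetric_lines = [
    ("FORCE", "product"),
    ("product", "MASS"),
    ("product", "ACCELERATION"),
]
index = {name: k for k, name in enumerate(points)}
adjacency = [[0] * len(points) for _ in points]
for a, b in symmetric_lines:
    # A symmetric relation is entered in both directions.
    adjacency[index[a]][index[b]] = 1
    adjacency[index[b]][index[a]] = 1

distances = distance_matrix(adjacency)
```

Under these assumptions, every key concept lies one line from product, so any two of the three key concepts are separated by a path of two lines.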
Cognitive structure is a hypothetical con-
struct referring to the organization (rela-
tionships) of concepts in memory. It is in-
vestigated by the method of word associa-
tion. The underlying assumption is that the
order of response retrieval from long-term
memory reflects at least a significant part
of the structure within and between con-
cepts. Therefore, the order of responses gen-
erated by each subject to each concept
takes on particular importance. These or-
dered distributions yield the following in-
formation about cognitive structure: (a)
what Noble (1963) calls “meaningfulness”
of the concept; and (b) the relationships
between these concepts (cf. Deese, 1962).

If content and cognitive structures can be 
represented objectively and independently 
—even if the structures are not represented 
in their complete form or without some dis- 
tortion—a beginning has been made toward 
an answer to the crucial question, “To what 
extent does the structure in the student’s 
memory correspond to the structure in the
instructional material?” 

In this study, a segment of physics was 
taught to students from a textbook (Dull et 
al., 1960). Instruction took place over a 5- 
day period. The subjects took a word asso- 
ciation test prior to the initiation of in- 
struction and following instruction on each 
of the 5 days. The purpose of this study was 
to examine the correspondence between digraph
analysis of content structure and
word associations. This, then, was a first
step toward answering the question of
correspondence.


METHOD
Instructional Material


The instructional material was taken from Modern
Physics by Dull et al. (1960, pp. 39-114). It
was divided into five instructional packages, one
for each of 5 consecutive days. Instructional
Packages 1-5 included pages 39-56, 56-81,
[...], [...], and 106-115, respectively. Johnson
([...], 1967, 1969) and Rothkopf and Thurner
(1970) have reported studies using this material.


Instrumentation 


The 14 stimulus words identified for digraph
analysis of the instructional material were
contained in the word association test. One of the 14
words was printed at the top of each page, and 
beneath it were two columns, each of 15 horizontal 
lines. The stimulus word appeared to the left of 
each horizontal line. The word association test con- 
tained 1 page of instructions and 14 pages for re- 
sponses. Since the word association test was ad- 
ministered six times—at pretest and then after 
each day of instruction—six versions were pre- 
pared. Each version was a random ordering of the 
14 key concepts; all of the subjects received the 
same ordering on a given day. The subjects were 
instructed to write as many words as they could 
think of when presented a key concept. They were 
allowed 1 minute for each key concept. 
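The preparation of six test versions, each a random ordering of the 14 key concepts, can be sketched as below. The concept names are those listed in Table 1; the function name `make_versions` and the fixed seed are hypothetical conveniences for reproducibility, and nothing is known about how the original orderings were actually drawn.

```python
import random

# The 14 key concepts, as listed in Table 1.
KEY_CONCEPTS = [
    "Momentum", "Inertia", "Power", "Mass", "Time", "Work", "Weight",
    "Acceleration", "Force", "Distance", "Velocity", "Impulse", "Speed",
    "Energy",
]


def make_versions(concepts, n_versions=6, seed=0):
    """Return n_versions independent random orderings of the concepts."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability only
    versions = []
    for _ in range(n_versions):
        order = list(concepts)
        rng.shuffle(order)
        versions.append(order)
    return versions


# One ordering per administration (pretest plus five instructional days);
# every subject would receive the same ordering on a given day.
versions = make_versions(KEY_CONCEPTS)
```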

As a methodological check on learning, two 
forms of an achievement test, with 30 items in 
each form, were constructed. For the 28 subjects in 
the instruction group, the internal consistency co- 
efficient alpha for Form A was .69 and for Form B 
was .74; the intertest correlation was .58. 


Subjects 


High school students who had not yet taken a 
course in physics but expressed interest in learn- 
ing about it were enlisted for the study. They were 
promised payment of $15 plus a bonus (up to $3) 
determined by their final achievement. The 40 
volunteers whose schedules could be accommo- 
dated to that of the experiment were divided at 
random into instruction (n = 28) and control (n = 
12) groups. 


Facilities 
Two lecture rooms at Stanford University, each 


large enough to accommodate 60 persons, were 
used.


Procedures 


The study was carried out over a 6-day period. 
The first day was devoted to subject orientation 
and pretesting. At that time, all subjects were 
instructed that this was a study to find out about 
how students learn concepts in physics so that
teachers could be trained to teach this subject 
matter better. Following orientation, the subjects 
took aptitude tests (not reported here), a word as- 
sociation test, and then the achievement test. Half 
of the subjects received Form A and half Form B. 

Subjects receiving instruction (instruction
group) read the five instructional packages, in
succession, one on each of the subsequent 5 days. 
They were instructed to read the text and to work 
the sample problems in the text. At the conclusion 
of each day’s instruction, a word association test 
was administered to all subjects in the instruction 
group. At the conclusion of the last day’s instruc- 
tion, the alternate form achievement posttest was 
administered along with the sixth word association 


The control group met in a second room on the 
5 days following orientation and pretesting. The 
purpose of the control group was to determine the
effects of repeated exposure to physics concepts
on the word association and achievement tests.
They were instructed that they were going to
receive a number of physics tests to determine how
much physics they could remember from past 
school courses. The first 2 days following pretesting 
were devoted to testing the control subjects. Five 
word association tests and then the alternate form 
of the achievement test were administered. The 
control group students, therefore, received the 
same number of tests as did the instruction sub- 
jects, but in a condensed period of time. 

Once the control group students completed
the physics testing, they participated in a
pilot study. Their bonus was calculated
from their score on the pilot study’s achievement 
posttest. All 40 subjects completed work on the 
sixth day at the same time. 

Design 

Half of the subjects received Form A of the 
achievement test as a pretest and Form B as a post- 
test; for the remaining subjects, order of testing 
was reversed. The word association test was ad- 
ministered in a repeated measures design. All sub- 
jects took the word association test at pretesting, 
and then five additional times. Word association 
testing for the instruction group followed each of 
the subsequent 5 instructional days. Word asso- 
ciation testing for the control group was condensed 
into a 3-day period.


RESULTS AND DISCUSSION


Content Structure 


The adjacency matrix for the 14 key con- 
cepts contains information about content 
structure. The original matrix was a 66 X 
66 design. It included the key concepts plus 
other concepts found in the path between 
key concepts. Rather than present the en- 
tire matrix, data on the 14 key concepts are 
presented in Table 1. The absence of zeros in 
this table means that none of the concepts 
is isolated from any of the others; each is a 
carrier. Hence the instructional material 
has a tight or formal structure.

The indegree of a point on the digraph (1 
of the 14 key concepts) is the number of 
lines from other concepts directed to that 
concept; the outdegree gives the number of 
lines from that concept to other concepts. 
Adding indegree and outdegree gives what 
is called the total degree. The total degree 
represents approximately the number of 
different points with which a given point is 

connected. However, caution is required in 
TABLE 1
SUMMARY OF ADJACENCY MATRIX

Key concepts     Indegree a    Outdegree b    Total degree c
Momentum              9             4              13
Inertia               2             3               5
Power                 3             4               7
Mass                 17            15              32
Time                  7            14              21
Work                 10             6              16
Weight                6             7              13
Acceleration         12            10              22
Force                20            17              37
Distance             12            12              24
Velocity             15            14              29
Impulse               3             3               6
Speed                 6             4              10
Energy                7             8              15

a Indegree = the indegree value is the number
of lines to that point on the digraph.

b Outdegree = the outdegree value is the number
of lines from that point on the digraph.

c Total degree = the sum of indegree and
outdegree.


interpreting Table 1. Since the adjacency 
matrix corresponds to a digraph derived 
from 57 select digraphs, it does not repre- 
sent the total number of lines between key 
concepts and other words in the text. In this 
study, the total degree for each concept in- 
dicates the frequency with which the text 
employs the key concepts in their closest 
relationships. Six of the 14 concepts have 
high total degrees: FORCE, MASS, VE- 
LOCITY, DISTANCE, ACCELERA- 
TION, and TIME. Of the 250 lines in the 
digraph connecting the 14 key concepts, 
these 6 concepts involved 187 of the concepts
or 68%.
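The indegree, outdegree, and total degree defined above can be read directly from an adjacency matrix, as in the following sketch. The four-point digraph and its lines are hypothetical (they are not the study's 66 X 66 matrix), and the name `degrees` is introduced here only for illustration.

```python
def degrees(adjacency, labels):
    """Return {label: (indegree, outdegree, total degree)}.

    adjacency[i][j] == 1 means a line leads from Point i to Point j,
    so row sums give outdegree and column sums give indegree.
    """
    result = {}
    for k, label in enumerate(labels):
        outdegree = sum(adjacency[k])
        indegree = sum(row[k] for row in adjacency)
        result[label] = (indegree, outdegree, indegree + outdegree)
    return result


# Hypothetical lines: MASS -> product, ACCELERATION -> product,
# product -> FORCE.
labels = ["FORCE", "MASS", "ACCELERATION", "product"]
adjacency = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
degree_table = degrees(adjacency, labels)
```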

From the 66 X 66 adjacency matrix, a
distance matrix was constructed. The striking
feature of this matrix is that no distance
between two concepts exceeds [...]
lines, and this occurred only twice; the
modal distance is two. This is consistent
with the knowledge that all key concepts in
the adjacency matrix are carriers. Such a
digraph is said to be “strong.”
Cognitive Structure 

This section is divided into two parts.
First, evidence that learning occurred is
presented to justify further investigation of
cognitive structure. Then the subjects’
cognitive structure and the way this structure
changed across the 5 days of instruction are
described.


Evidence of Learning 


If a subject were merely guessing, he 
would be expected to answer approximately 
7 questions (out of 30) correctly on either 
form of the achievement test. On the aver- 
age, both groups were above this level at 
pretesting (X̄ = 8.84, s = 4.63 for control
group; X̄ = 10.04, s = 3.10 for instruction
group). The two groups did not differ
significantly in achievement at pretesting
(t = 1.00, α = .05). The control group did not
improve significantly from pretest to posttest
(X̄ posttest = 9.58, s = 3.70; t = 1.36,
α = .05). The instruction subjects improved
significantly from pre- to posttest
(X̄ posttest = 16.18, s = 4.44; F = 111.03,
df = 1/26, α = .05); test order was not
significant (F = .15, df = 1/26, α = .05); Form
A was more difficult than Form B (X̄ =
12.11 and 14.16, respectively; F = 11.77, df
= 1/26, α = .05).

The word association test may be used to 
Fig. 2. Mean response frequency on the word
association test for the instruction group and the 
control group for each test day. 


investigate learning also. Noble (1963) de- 
fined “meaningfulness” to be directly pro- 
portional to the number of associates given 
to a stimulus word. As a subject learns 
physics, the key concepts should increase in
meaningfulness, and hence the average 
number of responses to each concept should 
increase. This was essentially what oc- 
curred (Figure 2). The control group data 


TABLE 2
PRETEST AND POSTTEST WORD ASSOCIATION DATA FOR A “TYPICAL” CONTROL
SUBJECT AND A “TYPICAL” INSTRUCTION SUBJECT


Pretest Posttest 
Subject 
Force Mass Acceleration Force Mass Acceleration 
_- 
Instruction Energy Amount Time Energy Density Deceleration 
Amount Measure | Amount Impulse Force Speed 
Magnetic | Density Speed Time Momentum Rate 
Electric Volume Rate Distance ‘Time 
Density Acceleration | Speed Measure 
Deceleration | Rate Velocity 
Speed Velocity Distance 
Velocity Time 
Friction 
Measure 
Distance _ 
Control Push Volume Pick-up 
Volume Speed-up | Push n 
Pull Great n Act Density Speed 
Drag Density Built-up Weight Change 
Weight Space Pull Compound | Motion 
Lever Earth Space 


Note.—The underlined words appear in more than one response distribution for a given subject on a
given test.
indicated that instruction group performance
reflected exposure to instructional material
in addition to practice on the word
association test. The initial increase in
control responses can be attributed to practice
effects. 

In Table 2, typical pre- and post-word 
association data generated by a control 
subject and an instruction subject for three 
key concepts exemplified the foregoing re- 
sults. Responses given by both subjects at 
pretesting and the control subject at post- 
testing were quantitatively and qualita- 
tively similar. They reflected the ways in 
which concepts are manifested in the natu- 
ral world. The posttest responses of the in- 
struction subject were quantitatively differ- 
ent from the other three lists and appeared 
to be qualitatively different, too. These 
posttest responses rely on concepts which 
measure and define the key concept. 


FORCE Equals MASS Multiplied by 
ACCELERATION 


If a subject responded with concepts that 
described the way in which these three con- 
cepts occurred in ordinary language, little 
overlap between responses to each key con- 
cept would be expected. In the example 
(Table 2), the underlined responses appear 
in more than one response distribution; that 
is, they are common associates to two or 
more key concepts. This expectation is con- 
firmed in the example by the absence of 
underlined responses for the control and in- 
struction subjects at pretest and the control 
subjects at posttest. If a subject responded 
to these three concepts knowing that they 
could define each other, there should have 
been considerably greater overlap. This is 
consistent with the instruction subject’s 
posttest data shown in the example in Table 
2. 


This overlap of associates to key concepts 
is at the focus of what is meant by cogni- 
tive structure in this study. The following 
subsection addresses itself to a detailed 
analysis of the word association data for 
cognitive structure. Further discussion of 
the learning measures may be found in 
Shavelson (1970). 
Analysis of the Word Association Test for
Cognitive Structure 


The relatedness coefficient (Garskoff & 
Houston, 1963) incorporated the response 
frequency to a given stimulus word with the
overlap between response distributions for 
pairs of stimulus words and thus provided a 
procedure for describing cognitive structure 
from word associations consistent with our 
definition. 

The formula for Garskoff and Houston's
(1963) relatedness coefficient is:

\[
RC = \frac{\bar{A} \cdot \bar{B}}{(A \cdot B) - [n^{p} - (n - 1)^{p}]^{2}}
\]

where

\(\bar{A} \cdot \bar{B}\) represents the rank order of
words under A which are shared in common
with B and the rank order of words
in B which are shared in A.

\(A \cdot B\) represents the rank order of words
in A multiplied by the rank order of
words in B.

\(n\) represents all of the words in the
longer list.

\(p\) represents some fixed number greater
than zero which may be determined
from the shape of the probability distribution
of the responses; p equalled 1 in
this study so that all portions of the
subject’s response distribution received
equal weight.

From the example in Table 2, the instruction
subject at pretesting gave the following
responses to the concepts of FORCE and
ACCELERATION:

FORCE       Rank    ACCELERATION    Rank
Force         5     Acceleration      5
Energy        4     Time              4
*Amount       3     *Amount           3
Magnetic      2     Speed             2
Electric      1

A = (5, 4, 3, 2, 1); B = (5, 4, 3, 2); n = 5.

\[
RC = \frac{3 \cdot 3}{(5\;4\;3\;2\;1)(5\;4\;3\;2\;1)^{\mathsf{T}} - [5^{p} - (5 - 1)^{p}]^{2}}
   = \frac{9}{55 - 1} \approx .17
\]
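The computation in this example can be sketched in code. The sketch assumes that each response list begins with the stimulus word itself, that the first word in a list receives rank n (the length of the longer list) and each later word one rank less, and that p is applied as an exponent on the ranks; these are readings of the definitions above rather than a verified statement of Garskoff and Houston's procedure. The function name `relatedness_coefficient` is hypothetical.

```python
def relatedness_coefficient(list_a, list_b, p=1):
    """Relatedness coefficient for two ordered response lists.

    Each list is assumed to start with its own stimulus word, to contain
    no repeated words, and to have at least two entries.
    """
    n = max(len(list_a), len(list_b))
    # Rank n for the first word, n - 1 for the second, and so on.
    weights_a = {word: (n - k) ** p for k, word in enumerate(list_a)}
    weights_b = {word: (n - k) ** p for k, word in enumerate(list_b)}
    shared = weights_a.keys() & weights_b.keys()
    # Numerator: products of the ranks of words common to both lists.
    numerator = sum(weights_a[w] * weights_b[w] for w in shared)
    # Denominator: the longer list's ranks multiplied by themselves,
    # less the correction term [n^p - (n - 1)^p]^2.
    full = sum((k ** p) ** 2 for k in range(1, n + 1))
    denominator = full - (n ** p - (n - 1) ** p) ** 2
    return numerator / denominator


force = ["Force", "Energy", "Amount", "Magnetic", "Electric"]
acceleration = ["Acceleration", "Time", "Amount", "Speed"]
rc_example = relatedness_coefficient(force, acceleration)  # 9 / 54
```

Under these assumptions two lists that name each other second and agree on every remaining rank reach the maximum value of 1, which is what the correction term in the denominator is for.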


The relatedness coefficient (RC) matrices 
for each subject were formed and then combined
as median matrices. This was done
separately for instruction- and control- 
group subjects. Each cell, then, represented 
the median relatedness coefficient for a con- 
cept pair. The relatedness-coefficient matrix 
was symmetric. 
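Combining the subjects' matrices into a median matrix amounts to taking the median of each cell across subjects. A minimal sketch follows; the three 2 X 2 matrices are invented values for illustration only, and `median_matrix` is a hypothetical name.

```python
from statistics import median


def median_matrix(subject_matrices):
    """Cell-by-cell median across a list of equally sized square matrices."""
    n = len(subject_matrices[0])
    return [
        [median(m[i][j] for m in subject_matrices) for j in range(n)]
        for i in range(n)
    ]


# Invented relatedness-coefficient matrices for three subjects.
subject_matrices = [
    [[1.0, 0.1], [0.1, 1.0]],
    [[1.0, 0.3], [0.3, 1.0]],
    [[1.0, 0.2], [0.2, 1.0]],
]
group_matrix = median_matrix(subject_matrices)
```

Because each subject's matrix is symmetric, the cell-wise median is symmetric as well.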

The relatedness-coefficient matrices, then, 
represented the relationships among the key 
concepts in memory at each of the six test 
administrations. If no new information was 
studied, responses to key concepts remained 
qualitatively similar; seldom should two response
distributions share the same associate.
Thus, relatedness coefficients should
not increase appreciably across the six con- 
trol-group relatedness-coefficient matrices. 
And, if a median relatedness coefficient were 
calculated for each control relatedness-coef- 
ficient matrix, the six median relatedness 
coefficients should be very small and uni- 
form across matrices. 

The instruction subject’s word associations
should change qualitatively. His associates
on the word association posttest are
constrained primarily by concepts in Newtonian
mechanics. Since these concepts
should be closely interrelated, pairs of
response distributions should share many
associates in common. For the instruction
group, the six median relatedness coeffi- 
cients should increase across the six test ad- 
ministrations. 

Median relatedness coefficients for control
and instruction group relatedness-coefficient
matrices are shown in Table 3. Control median
relatedness coefficients are low and uniform
across the six tests; instruction-group
median relatedness coefficients increase
across the 6 days. Table 3 presents evidence


TABLE 3
MEDIAN RELATEDNESS COEFFICIENT IN EACH
RELATEDNESS-COEFFICIENT MATRIX ACROSS
DAYS FOR INSTRUCTION AND CONTROL
GROUPS

                            Day
Group           1     2     3     4     5     6
Instruction    .00   .09   .15   .22   .27   .32
Control        .00   .00   .06   .04   .00   .02
TABLE 4
EUCLIDEAN DISTANCE MATRIX FOR ALL PAIRS OF
INSTRUCTION-GROUP AND ALL PAIRS OF
CONTROL-GROUP RELATEDNESS-COEFFICIENT
MATRICES

                  Instruction-group days
Days      1      2      3      4      5      6
1                1.14   1.46   2.00   2.30   2.61
2         .99           .93    1.58   1.79   2.04
3        1.24    .75           1.08   1.26   1.62
4        1.26   1.12    .96           .68    1.06
5        1.33    .89    .87    .95           .80
6        1.12    .78    .75   1.03    .82

Note.—Instruction-group range = 1.93;
control-group range = .58.


of increasing constraint by content struc- 
ture on a subject’s cognitive structure. 

Additional evidence for the constraint of 
content structure on a subject’s cognitive 
structure is provided by a comparison of 
entire relatedness-coefficient matrices with 
each other. To investigate the relation be- 
tween relatedness-coefficient matrices, Eu- 
clidean distance between pairs of related- 
ness-coefficient matrices was calculated.* 
Euclidean distance represents the absolute 
distance between two matrices. The greater 
the distance value, the greater the dissimi- 
larity between two matrices. 

Euclidean distances between control
group matrices should be relatively small 
and uniform. Less stability is expected for 
distances between instruction group matri- 
ces. The greatest distance should occur 
when instruction-group relatedness-matri- 
ces are compared at Days 1 and 6, but not 
necessarily for the controls. 

Table 4 presents the Euclidean distances 
between all pairs of instruction-group relat- 
edness-coefficient matrices above the diago- 
nal and the same comparisons for the con- 
trol group below the diagonal. 


* The author wishes to thank Professor Olkin
for this suggestion. Euclidean distances between
pairs of matrices are calculated as follows: (a) a
difference score is calculated for each corresponding
pair of cells in the two matrices, (b) these
differences are squared, (c) the squared differences
are summed, (d) the square root of the sum of the
differences squared divided by the number of cells
is calculated.
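Steps (a) through (d) can be sketched as follows. The wording of step (d) admits more than one reading, so the sketch takes the root of the mean squared difference by default and exposes a hypothetical flag, `divide_by_cells`, for the plain root of the summed squares; which reading was actually used in the study is not established here. The matrices in the example are invented.

```python
import math


def matrix_distance(m1, m2, divide_by_cells=True):
    """Distance between two equally sized matrices, following steps (a)-(d)."""
    # (a) and (b): squared difference for each corresponding pair of cells.
    squared = [
        (a - b) ** 2
        for row1, row2 in zip(m1, m2)
        for a, b in zip(row1, row2)
    ]
    # (c): sum of the squared differences.
    total = sum(squared)
    # (d): square root, optionally after dividing by the number of cells.
    if divide_by_cells:
        total /= len(squared)
    return math.sqrt(total)


# Invented 2 X 2 matrices: every cell differs by exactly 1.
ones = [[1, 1], [1, 1]]
zeros = [[0, 0], [0, 0]]
d_mean = matrix_distance(ones, zeros)                        # sqrt(4 / 4)
d_sum = matrix_distance(ones, zeros, divide_by_cells=False)  # sqrt(4)
```

Either way, identical matrices are at distance zero, and a larger value means greater dissimilarity.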
TABLE 5
EUCLIDEAN DISTANCE MATRIX FOR THE CORRESPONDENCE
BETWEEN THE INSTRUCTION- AND
CONTROL-GROUP MATRICES

Control-group         Instruction-group days
days            1      2      3      4      5      6
1              .70   1.41   1.72   2.23   2.50   2.82
2              .86   ...    1.11   1.71   2.03   2.35
3             1.04    .79    .88   1.58   1.83   2.15
4             1.07   1.13   1.22   1.78   2.04   2.34
5              .93   1.01   1.12   1.71   2.01   2.32
6              .87    .92   1.17   1.79   2.03   2.34


Inspection of the data below the diagonal
in Table 4 confirmed the hypothesis of
stable, relatively small distances between
control-group relatedness-coefficient matrices
(range = .58 with a median distance of
.99). Above the diagonal, the distances for
the instruction group are considerably less
stable (range = 1.93); the median distance
is 1.46.

The hypothesis that the greatest distance 
between two instruction-group relatedness- 
coefficient matrices should occur at Days 1 
and 6 was confirmed. Column 6 of Table 4 
shows a consistent decrease in distances 
from 2.61 at Day 1 to .80 at Day 5 when
compared with Day 6. No such consistent 
decrease is found in row 6 for the control 
group. 

These results indicated a fairly stable 
cognitive structure for the control group 
and a shifting structure for the instruction 
group. If this interpretation is sound,
the instruction-group relatedness-coefficient 
matrices earlier in instruction (Matrices 
1-3) should be more similar to the six
control-group relatedness-coefficient matri- 
ces than instruction-group matrices later in 
instruction (Matrices 4-6). Table 5 pre- 
sents the Euclidean distances for each con- 
trol-instruction pair of relatedness-coeffi- 
cient matrices. The data on the diagonals 
and in column 6 and row 6 of Table 5 con- 
firm this hypothesis. 

If each of the key concepts was represented
by a point such that the interpoint

distances corresponded in some sense to the 
similarities between concepts, Kruskal’s 
(1964) multidimensional scaling procedure 
could be used to investigate the word association
data. The relatedness-coefficient
matrices were considered similarity matrices
and scaled. The solution for the control
group matrices should be more similar to in- 
struction-group matrices on Days 1-3, but 
differ from instruction-group data on Days 
4-6. In short, subjects receiving instruction 
were expected to interrelate the 14 key con- 
cepts in memory differently than the control 
subjects. 

A different prediction follows from an alternate
hypothesis. As Johnson (1967,
1969) observed, many Newtonian mechanics
concepts occur in ordinary language.
They describe the natural world. These
“prescientific” meanings many times interrelate
certain concepts in a manner consistent
with Newtonian mechanics. For example,
SPEED and VELOCITY are often
used interchangeably in ordinary language.
Adding this information to the finding that
the subjects entered this study with some
knowledge of Newtonian mechanics, the
multidimensional scaling solutions for the
control group might be similar to the solutions
for the instruction group across
relatedness-coefficient matrices.

Two-dimensional scaling solutions were
selected since they matched the data fairly
well (stress < .10) and provided essentially
the same clusters as the three-dimensional
solution. The alternate hypothesis was confirmed.
Four main concept clusters emerged
on the word association pretest and remained
across tests for both groups. Cluster
1 includes the concepts of FORCE, WORK,
POWER, and ENERGY; Cluster 2 includes
the concepts of MASS and
WEIGHT; Cluster 3 includes the concepts
of DISTANCE and TIME; and Cluster 4
includes the concepts of VELOCITY,
SPEED, and ACCELERATION.
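For reference, the stress value reported for a scaling solution is, in Kruskal's (1964) formulation, a normalized badness-of-fit measure. The notation below is supplied here for illustration and does not appear in the text: \(d_{ij}\) is the distance between points i and j in the scaled configuration, and \(\hat{d}_{ij}\) is the corresponding fitted value that preserves the rank order of the similarities.

```latex
\[
\text{stress} = \sqrt{\frac{\sum_{i<j} \left(d_{ij} - \hat{d}_{ij}\right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
\]
```

A stress of zero would mean that the interpoint distances reproduce the rank order of the similarities exactly; smaller values indicate a closer fit.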


Correspondence between Content Structure
and Cognitive Structure


The digraph analysis of content structure
indicated a tight, formal structure. The
word association data reflecting cognitive
structure show that a subject entered the
study with some verbal skills in,
and knowledge of, Newtonian mechanics.
As a result of exposure to the instructional
material a subject's achievement and response
frequency to key concepts increased.
Also, the key concepts became more closely 
interrelated by a subject during the course 
of instruction. The data presented in this 
section bear on the extent to which changes 
in memory due to learning are constrained 
by the structure in the text. 

One way to investigate the correspond- 
ence between content structure and cogni- 
tive structure is to determine the similarity 
between the digraph-distance matrix (con- 
tent structure) and the relatedness-coeffi- 
cient matrices (cognitive structure). Since 
the control group did not encounter the instructional
material, Euclidean distances between
control relatedness-coefficient matrices
and the digraph-distance matrix should
be stable, and approximately as distant from
the digraph matrix on Test 1 as on Test 6.
Similar comparisons using instruction-group
relatedness-coefficient matrices should show
an increase in similarity (i.e., a decrease in
distance) from Day 1 to Day 6.

Euclidean distances were computed be- 
tween each relatedness-coefficient matrix 
and the converted digraph-distance matrix.’ 
The results of this analysis are presented in 
Table 6. The Euclidean distances decreased 
between relatedness-coefficient and digraph 
matrices for the instruction group, but not 
for the control group. This suggested that 
content structure influenced the organization
of concepts in memory. However, the
evidence does not suggest a near perfect 
correspondence between content structure 
and cognitive structure. 


To further investigate the correspond- 
[...] digraph distance was 2 and the maximum distance [...]
TABLE 6
EUCLIDEAN DISTANCE MATRIX: DISTANCE BETWEEN
CONTENT STRUCTURE AND COGNITIVE
STRUCTURE FOR INSTRUCTION AND CONTROL
SUBJECTS ACROSS DAYS

                             Days
Group           1      2      3      4      5      6
Instruction    6.49   5.92   5.53   4.90   4.52   4.22
Control        6.69   6.28   6.07   6.17   6.17   6.25


ence, the digraph-distance matrix was 
scaled using Kruskal's (1964) procedure. A 
three-dimensional solution (stress = .045 
versus stress = .15 in two dimensions) pro- 
vided an adequate fit of the data. Three 
clusters appear. Cluster 1 includes the con- 
cepts of WORK, POWER, and ENERGY; 
Cluster 2 includes the concepts of 
WEIGHT, MASS, and MOMENTUM; and 
Cluster 3 includes the concepts of ACCEL- 
ERATION and VELOCITY. INERTIA re- 
mains at a constant distance from Cluster 
2; SPEED is remotely related to Cluster 3. 

These three clusters are similar to three 
of the four clusters which emerged from the 
control-group and instruction-group
relatedness-coefficient matrices at pretesting.
This helps to explain why three of the four 
clusters remained in the instruction group’s 
solutions throughout instruction. These find- 
ings are also in agreement with other data, 
indicating that the subjects participating in 
this study were familiar with the concepts 
presented in instruction. 

In summary, learning (achievement) and 
cognitive structure (word associations) 
data for the instruction subjects indicated 
that cognitive structure changed considerably
during instruction; achievement increased
significantly from pretest to posttest;
and key concepts in Newtonian mechanics
were interrelated more closely at
the end of instruction than at the beginning. 
These changes were not observed for the 
control subjects. The cognitive structure of 
the instruction subjects corresponded more 
closely to the content structure at the end 
of instruction than at the beginning; no 
similar change was observed for the control 
group. All subjects (N = 40) entered the 
study with prescientific meanings for Newtonian
mechanics concepts; that is, the subjects
interrelated key concepts in a manner
consistent with the structure of Newtonian 
mechanics. This preinstruction structure 
provided a basis for acquiring the input 
(content) structure. 
Differences in the rate of successive acquisition of four concepts as a
result of two presentation conditions and their interaction with the
learner's preferred strategy were investigated. Concept instances were
either intermixed or blocked during presentation. Sixty undergraduates
were given three tasks designed to measure learner strategy. They
then learned four concepts involving the identification of geometric
attributes. The overall rate of successive acquisition of concepts
following the first was decidedly slower with the mixed presentation,
implying more interference from intermixed instances than from
blocked instances. The one strategy measure that interacted with
presentation conditions indicated that learners who randomly formulated
hypotheses were not strongly influenced by presentation conditions;
those who manifested a systematic strategy, however, benefited by
the blocked presentation.


Research on sequence of instances in concept
learning pertains to the broader area
of sequencing of instruction. The concept-learning
research of the type reviewed for
the present study applies most readily to
the procedure in which the instructor directs
the presentation of events and outcomes
for the learner, but requires the
learner to discover the generalization himself.
An example of such a procedure would
be a biology instructor's use of pictures or
[...] of animals with the classification
label for each to teach students the bases
for the [...] classifications. Another example
is a music professor’s presentation of
music [...] eras to lead students to
[...] the bases for distinguishing music
of different eras. In such instructional settings,
the perceptive teacher might ask the
question: What order of presentation of
examples will provide the most efficient
instructional strategy? Should the teacher present
examples of the various classifications
(or eras) in some intermixed sequence in
order to demonstrate both similarities and
differences among the classifications (or
eras), or should the teacher group all
examples of each classification (or era)
together in order to emphasize the particular
characteristics of each?


1 This research was supported by funds made
available by the Advanced Research Projects
Agency (ARPA Order No. 1269) through the Office
of Naval Research under Contract N00014-[...].

2 Requests for reprints should be sent to Nich[...],
Department of Educational Psychology,
Pennsylvania State University, 201 [...]
Building, University Park, Pennsylvania 16802.

Underwood (1952) theorized that the sec- 
ond method, the grouping of concept in- 
stances contiguously, would produce more 
rapid learning since the difficulty of remem- 
bering prior instances, in order to abstract 
or induce the relevant properties, is mini- 
mized. When instances of the same concept 
are grouped contiguously, identification of 
the relevant stimulus properties depends 
primarily upon set or attention. However, 
when instances of different concepts are in- 
termixed, the necessity for “minimizing 
memory strain” is introduced for the learner 
(Bruner, Goodnow, & Austin, 1956). Thus, 
even though the learner may accurately 
perceive all the properties of one instance, 
some or all of the properties may be forgotten
in the period intervening before the
presentation of the next instance of that
concept.
Dominowski (1965), in a review of studies
related to the role of memory in concept
learning, concluded that Underwood's
(1952) prediction concerning the effects of 
instance contiguity had been well supported 
in studies using a variety of stimulus ma- 
terials and experimental procedures. Thus, 
there is overwhelming evidence that an in- 
structor should present examples of each 
concept classification consecutively, in 
blocks, rather than intermixing the examples
of various concepts.

The present study was designed to extend 
previous investigations in two ways. First, 
no previous study has traced the course of 
learning in the two conditions. Does the ac- 
quisition of a concept in an intermixed in- 
stance presentation sequence eliminate in- 
terference from instances of the acquired 
concept in learning the additional con- 
cepts? That is, does the acquisition of a 
concept have the effect of increasing con- 
tiguity of the unlearned concept instances? 
If contiguity were effectively increased by 
learning a subset of the set of concepts to be 
learned, one would predict greatest differ- 
ences between blocked and intermixed in- 
stance conditions in learning the first con- 
cept and least differences in learning the 
last concept. However, if instances of ac- 
quired concepts do interfere with attainment 
of unlearned concepts, the rate of learning 
in the two conditions would be constant, 
and attainment of the last concept would 
exhibit greatest differences between condi- 
tions. 

The second elaboration on the topic stems 
from a suggestion made by Kurtz and Hov- 
land that the effect of variation in instance 
contiguity may depend on the general man- 
ner in which subjects set about to learn 
concepts: 


At one extreme, some Ss seem to ‘randomly’ formulate and test
various possible hypotheses, while at the other extreme, some Ss
carefully study the concept instances presented in an attempt to
‘infer’ the common properties, reserving the choice of a hypothesis
until sufficient data are available. ... It seems likely that, among
Ss who actively attempt to abstract the common properties of several
instances before formulating a hypothesis, the unmixed order of
presentation would be relatively easier than the mixed order, but
among Ss choosing hypothesis by trial and error the difference might
be considerably reduced [1956, pp. 242-243].


Similar differences in subjects’ concept- 
learning strategies have been described by 
many psychologists (cf. Bruner et al., 1956; 
Sechrest & Wallace, 1962; Wickelgren, 
1964). For reference purposes, subjects who 
tend to “randomly formulate and test hy- 
potheses” shall be termed hypothesis spew- 
ers and those who reserve judgment in an 
effort to infer common properties shall be 
called conservative strategists. In the pres- 
ent study, measures were devised to differ- 
entiate hypothesis spewing and conservative 
strategy tendencies according to the Kurtz 
and Hovland (1956) description, in order to 
test their proposal of the differential effects 
of instance contiguity as a function of 
learner strategy. 


METHOD


Subjects 


The subjects were 60 students, 21 males and 39 
females, who volunteered for participation in this 
experiment. They were enrolled in an introductory 
educational psychology course and received extra 
credits in the course for participating in the
study. 


Materials 


Stimulus patterns were presented on 3 X 5 cards and were generated
from five dimensions with three values each. The color of the card
(black, gray, or white) was one dimension. In the center of the card
was a 1/2 X 3 inch white surface on which the four additional
dimensions were varied: number of letters (one, two, or three);
specific letters (A, B, or C); colors of letters (red, blue, or
yellow); and emphasis by under- or overscoring, or both (a single
horizontal line above or below the letter[s] or a line both above
and below).
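
The size of the full stimulus set follows directly from crossing five three-valued dimensions. The sketch below is only an illustration of that arithmetic, not the authors' procedure; the dictionary keys and value strings are hypothetical names chosen here to mirror the dimensions listed above.

```python
from itertools import product

# Dimension names and value labels are illustrative assumptions;
# only the five dimensions and their three values come from the text.
DIMENSIONS = {
    "card_color": ("black", "gray", "white"),
    "number_of_letters": (1, 2, 3),
    "letter": ("A", "B", "C"),
    "letter_color": ("red", "blue", "yellow"),
    "lining": ("above", "below", "both"),
}


def all_patterns():
    """Return every combination of one value per dimension."""
    names = list(DIMENSIONS)
    return [dict(zip(names, values)) for values in product(*DIMENSIONS.values())]


patterns = all_patterns()
print(len(patterns))  # 3 ** 5 combinations
```

Each element of `patterns` is one card description, so filtering the list by two fixed values would pick out the members of a two-value concept.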

Of the set of 243 possible stimulus patterns capable of being
generated from combinations of these dimensions, only those 81 that
were members of one of the four concepts to be learned, plus 12
additional patterns, were used. The specific concepts were: (a) one
letter on a black card, (b) one letter on a gray card, (c) two
letters underlined, and (d) two letters with lines above. The values
of the relevant dimensions on the 12 additional patterns, used as
nonexamples during test trials, were either one letter on a white
card or two letters with lines both above and below.

The concept labels, NUZ, KEY, [illegible], and BAF, were on 5 X 7
cards. The labels were from the medium meaningfulness range on
Archer's (1960) norms, and were selected to maximize distinctiveness
among letters involved.

Procedure

All of the subjects were administered the task 
individually. The subjects were divided equally 
into the two experimental conditions (i.e. blocked 
or mixed presentation of concept instances) and 
were alternately placed in the two conditions as 
they appeared for the experiment, except in cases 
when there was not enough time between sessions 
to rearrange stimuli for the other condition. 
Learning blocks of trials were alternated with 
test blocks of trials. A learning block consisted of 
the successive presentation of 16 instances each of 
which was accompanied by the experimenter's ver- 
balization of the concept label. Subjects did not 
respond overtly during this period. The test block 
of trials consisted of the successive presentation of 
six instances to which subjects responded with a 
concept label. Also, the subject was asked to hy- 
pothesize the relevant dimensions for each con- 
cept but he received no feedback during this test 
block. This procedure was continued until the sub- 
jects correctly identified all six instances and cor- 
rectly verbalized all four concepts. 
In the blocked presentation condition, the subjects saw four
consecutive examples of each of the four concepts during the
learning trials. Order of presentation of each of the blocks (i.e.,
four examples of the same concept) of four concepts within each
learning trial was randomized across trials. During the learning
trials in the mixed presentation condition, the subject saw four
examples of each of the four concepts, but each block of four
examples contained one example of each concept, not four examples of
the same concept. Presentation of concepts was randomized within
each block of four examples, with the restriction that no two
examples of the same concept could appear consecutively.

The experimenter began the session by explaining that he was
investigating how people formed concepts and introduced the subjects
to four sample cards similar to the ones to be used in the actual
task. The five dimensions of card color, letter, number of letters,
letter color, and lining of letter were delineated. Then the subject
was asked to describe the four sample cards giving all five values
present on each of the cards. At this point the subject was told he
would learn four concepts of two values each. The card with the
concept labels was turned over, the experimenter pronounced each
label for the subject and explained they would be used as names for
the concepts.
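
The blocked and mixed orderings of a 16-card learning trial described in this procedure can be sketched as follows. This is a hedged illustration only: the concept identifiers, the function names, and the use of reshuffling to enforce the no-repeat restriction are assumptions made here, since the report does not say how the orders were actually produced.

```python
import random

# Hypothetical identifiers for the four concepts named in Materials.
CONCEPTS = (
    "one_letter_black_card",
    "one_letter_gray_card",
    "two_letters_underlined",
    "two_letters_overlined",
)


def blocked_sequence(rng):
    """Four consecutive examples per concept; block order is random."""
    order = list(CONCEPTS)
    rng.shuffle(order)
    return [concept for concept in order for _ in range(4)]


def mixed_sequence(rng):
    """Four blocks, each holding one example of every concept in random
    order, with no concept appearing twice in a row across blocks."""
    sequence = []
    for _ in range(4):
        block = list(CONCEPTS)
        rng.shuffle(block)
        # Reshuffle until the block does not start with the concept
        # that ended the previous block (assumed mechanism).
        while sequence and block[0] == sequence[-1]:
            rng.shuffle(block)
        sequence.extend(block)
    return sequence


rng = random.Random(0)
print(blocked_sequence(rng))
print(mixed_sequence(rng))
```

Within a block of the mixed sequence the four concepts are all different, so the restriction can only be violated at a block boundary, which is why the check looks only at the first card of each new block.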

The first deck of examples was presented, one card at a time at
2-second intervals. The concept label associated with each example
was pronounced to the subject as the card was shown. After the
examples of the learning trial were presented, the subject was asked
what he thought the two values were that made up each of the four
concepts. The values mentioned by the subjects were recorded on a
data sheet. The experimenter wrote [illegible] if the subject said
he didn't know. No feedback was given to these responses. Then a
test-trial set of six cards was presented one at a time
and the subject was asked to identify the 
concept exemplified by the card or to say it was 
not an example of any of the four concepts. In 
every test trial there was an example of each of 
the four concepts plus two cards which were not 
examples of any of the concepts. (The subject 
was not informed of these characteristics of the 
set of test instances.) The subject had not previ- 
ously seen the four test examples of concepts; these 
examples were always used as test trials before 
their presentation as examples during a learning 
trial. The experimenter recorded whether the sub- 
ject responded correctly or incorrectly to each of 
the test examples presented, but gave no feedback 
to the subject. 

There were six decks of 16 cards employed for 
the learning trials. The order in which the decks 
were presented was randomized for each subject. 
The procedure of a learning trial followed by a
test trial was repeated until the subject could 
correctly verbalize the two values of each concept 
and could identify the four concept instances and 
two negative instances of the test trial without a 
mistake. If the subject had not attained a cri- 
terion performance in the six learning and test 
trials, the experimenter repeated the presentation 
of the six decks of learning and test instances in 
an order different from the first presentation of 
those decks until criterion performance was 
achieved. 


Learner Strategy Measures 


Immediately following the concept-learning 
task, subjects were given three additional tasks in 
the order presented below. The tasks were designed 
to assess various aspects of the differences among 
individuals suggested by Kurtz and Hovland (1956).

Concept learning by selection. Using the selection paradigm for
study of concept-learning strategy, the experimenter presented 16
stimuli arranged in four equally spaced rows on a 24 X 12 inch gray
poster board. The stimuli were on 14- [...] midpoint of the
background [...]

The subject was instructed that this task was similar to the one
just completed. The four di- [...] was given. The subject was [...]
of that concept found on the board. [...] select other pictures, one
at a time, and the experimenter [...] not an example of the concept.
The subject could
guess the two attributes of the concept at any 
time. If his verbalization was correct, the task was 
completed. If it was incorrect, the experimenter 
said, “No, that is not the concept,” and the task 
was continued. 

Responses were recorded on a tape recorder and 
were later scored for latency of selections, number 
of values mentioned in hypotheses for each selec- 
tion, and differences in stimulus dimensions be- 
tween the focus instances and the first instance 
selected by the subject. Delay in making selections, 
fewer number of hypotheses produced, and the 
variation of only one dimension with the first 
selection were considered indications of the con- 
servative strategy in concept learning. 

The Many Good Uses Test. The next task was 
a specially devised Many Good Uses Test. The 
subject was told that he would be given the names 
of five familiar objects, one at a time. After each 
object was named, he was to give as many good 
uses of the object as he could think of within 
1 minute. Number of uses was assumed to vary as 
a function of the subject’s concentration on many 
uses (hypothesis spewer) or good uses (conserva- 
tive strategist). Newspaper, shoe, cork, chair, and 
nickel were used as stimuli. All responses (even
repetition) were recorded on a data sheet by the 
experimenter and were summed to measure “ran- 
dom formulations of hypotheses.” 

Self-report on learning strategy. The final task
was a self-report inventory of 30 items. Twenty 
items were constructed to determine hypothesis 
spewing (10 items) or conservative strategy (10 
items) from the subject’s rating of his own be- 
havior or of how another person would rate his 
behavior. A single score was obtained for these 
20 items: A high score indicated a self-report of a 
conservative strategist, indicating the person as 
thoughtful, unimpulsive, reflective, and wanting all 
details of a situation before acting. From the Test 
Anxiety Scale for Children (Sarason, 1960), 10 
items were chosen as being most relevant to test- 
taking anxiety in a college setting and were re- 
worded for college students, that is, teacher was 
changed to professor and school was changed to 
college. The 10 anxiety items were used as filler 
items. 


RESULTS 


Analyses were made of (a) the number of 
trials required to reach correct verbaliza- 
tion of the concept and (b) the number of 
trials required to identify correctly the six 
test instances. The differences between the 
means of the blocked and mixed presentation on each criterion
yielded a t = 1.80, p < .05 (one-tailed test). The means and
standard deviations of these data are presented in Table 1. These
results indicated that both criteria were reached more rapidly when
concept instances were blocked rather than intermixed.

TABLE 1
MEANS AND STANDARD DEVIATIONS OF TRIALS ON WHICH SUCCESSIVE
CRITERIA WERE ATTAINED IN BLOCKED AND MIXED PRESENTATION CONDITIONS

Criterion 1: Number of instances correctly identified

Condition  Statistic     1      2      3      4      5      6
Blocked    Mean       1.43   1.87   2.53   3.00   3.70   4.07
           SD         1.39   1.52   1.71   1.81   1.95   2.20
Mixed      Mean       1.13   1.53   2.33   3.40   4.40   4.97
           SD          .43    .92   1.19   1.08   1.66   1.63

Criterion 2: Number of concepts correctly verbalized

Condition  Statistic     1      2      3      4
Blocked    Mean       2.18   2.53   3.47   4.00
           SD         1.56   1.84   2.16   2.48
Mixed      Mean       2.13   2.70   4.20   4.97
           SD         1.09   1.07   1.42   1.76

Differential trends in the rates of acquisition under the blocked
and mixed presentation conditions were examined in two mixed
analyses of variance. In both analyses, the presentation conditions
comprised the between-subjects variable and the levels of successive
criteria (Underwood, 1966, pp. 451-454) of concept attainment
comprised the within-subjects variables. In one analysis, the
dependent variable was the number of trials in which successive
criteria of one, two, three, and four concepts were correctly
verbalized; in the other, the dependent variable was trials on which
successive criteria of one through six test instances were correctly
identified. The analyses yielded F = 4.06, df = 3/1[...], p < .01
for the interaction between mode of presentation and trials to reach
the verbalization criterion, and F = 5.76, df = 5/[...], p < .01 for
the interaction between presentation mode and trials to reach the
identification criterion. The nature of the interactions may be
traced from the means in Table 1 where it can be seen that the rate
of acquisition is more rapid under blocked than under the mixed
presentation. The main effects for these analyses are not reported
because the successive criteria conversion made them difficult to
interpret. In particular, the between-subjects main effect involves
comparing the mean number of trials required to reach an average
number of successive criteria (which was 2.5 concepts with the
verbalization criterion and 3.5 instances with the identification
criterion) under the two modes of presentation.

Another set of analyses was carried out to test the notion
suggested by the Kurtz and Hovland (1956) prediction that learners
who randomly test hypotheses (hypothesis spewers) will be less
affected by the intermixing of concept instances than the systematic
learner (conservative strategist). Regression coefficients for the
relationships of each of the five specially devised strategy
measures with the two dependent variables were obtained for the
blocked and the mixed presentation conditions. The prediction of an
interaction between presentation condition and learner strategy was
then tested by comparing the coefficient based on data for subjects
in the blocked presentation with the related coefficient based on
data for subjects in the mixed presentation. (The analysis was based
on the one ordinarily used for homogeneity of regression prior to an
analysis of covariance.) Ten F ratios were so computed, one for each
comparison of the 10 combinations of strategy measures and dependent
variables. Only one of the five strategy measures, the Many Good
Uses Test, produced a significant interaction with presentation
condition. This test yielded F = 4.07, df = 1/56, p < .05 for the
effect due to the interaction between the Many Good Uses Test and
mode of presentation on the verbalization criterion; and F = 3.52,
df = 1/56, p < .10 for the same interaction when the concept
instance identification criterion was employed. As shown in Figure
1, this interaction between strategy and presentation condition
supports Kurtz and Hovland’s contention that the performance of
conservative strategists, when compared with hypothesis spewers, is
hindered by the mixed presentation. The related correlation
coefficients for all five measures are presented in Table 2.

[Figure 1: trials to concept verbalization (y-axis) plotted against
Many Good Uses Test score (x-axis), with separate regression lines
for the blocked and mixed conditions.]

FIG. 1. The effect of blocking and intermixing instances as a
function of learner strategy, with low scores indicating
conservative strategists and high scores indicating hypothesis
spewers. (The regression lines are drawn for points 1/2 SDs above
and below the mean on the Many Good Uses Test.)

TABLE 2
CORRELATIONS OF EACH STRATEGY VARIABLE WITH THE INSTANCE
IDENTIFICATION AND CONCEPT VERBALIZATION CRITERIA IN THE TWO
PRESENTATION CONDITIONS

                                     Number of trials to    Number of trials to
                                     correct identification correct verbalization
Strategy variable                    Blocked    Mixed       Blocked    Mixed
Concept learning by selection task:
  Average latency in making
    instance selections (a)            .07       -.20         .12       -.15
  Average number of hypothesized
    values per trial (b)              -.02       -.03         .07       -.04
  Number of values in the first
    selection differing from the
    focus instance (b)                 .06       -.17         .12       -.26
Many Good Uses Test (b)                .98       -.27         .26       -.27
Self-report of strategy (a)            .10        .20         .23        .18

(a) High scores indicated conservative strategists.
(b) High scores indicated hypothesis spewers.
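
The comparison of regression coefficients across the two presentation conditions is described only as the test "ordinarily used for homogeneity of regression." One common textbook form of that test is sketched below as an assumption about what was computed, not as the authors' actual calculation: it compares the residual sum of squares from separate slopes per condition with the residual from a single pooled within-condition slope. The function names and the sample data are hypothetical.

```python
def _sums(x, y):
    """Return (Sxx, Syy, Sxy) about the group means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def slope_homogeneity_f(groups):
    """F test that the regression slope of y on x is equal across groups.

    groups: list of (x, y) pairs, each a pair of equal-length lists.
    Returns (F, df_numerator, df_denominator).
    """
    stats = [_sums(x, y) for x, y in groups]
    k = len(groups)
    n_total = sum(len(x) for x, _ in groups)
    # Full model: separate intercept and slope in every group.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, syy, sxy in stats)
    # Reduced model: separate intercepts, one pooled slope.
    pooled_sxx = sum(s[0] for s in stats)
    pooled_syy = sum(s[1] for s in stats)
    pooled_sxy = sum(s[2] for s in stats)
    sse_reduced = pooled_syy - pooled_sxy ** 2 / pooled_sxx
    df_num = k - 1
    df_den = n_total - 2 * k
    f_ratio = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_ratio, df_num, df_den


# Made-up scores for two conditions of 30 subjects each.
x = [float(i) for i in range(30)]
noise = [(-1.0) ** i for i in range(30)]
blocked = (x, [2.0 + 0.5 * xi + e for xi, e in zip(x, noise)])
mixed = (x, [5.0 - 0.5 * xi + e for xi, e in zip(x, noise)])
print(slope_homogeneity_f([blocked, mixed]))
```

With two conditions and 60 subjects this formulation gives 1 and 56 degrees of freedom, which agrees with the df reported for the strategy-by-presentation tests.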


Discussion 


The finding that blocking instances leads to more rapid concept
acquisition than intermixing instances provides additional support
to the conclusions reached by Dominowski (1965) in his review of the
literature on the effects of concept instance contiguity. Since the
procedures of the present study differed from previous studies by
the use of alternation of training on new subsets of instances and
new test instances, greater generality of the effects of blocking
instances has been obtained.

The alternation of learning and testing 
procedures also made it possible to trace 
the differential history or course of learning 
under the blocked and mixed presentation 
procedures. The data indicated that the at- 
tainment of concepts, once the first concept 
was learned, was slower in the mixed than 
in the blocked presentation condition on 
both criteria. The comparison of rate of 
concept attainment in the mixed condition 
with that in the blocked supports the view 
that learning one concept does not eliminate 
the interference the instances of that con- 
cept have on learning other concepts. Thus, 
not only is the interspersing of examples of 
concepts less effective than blocking in- 
stances, but, in view of the present findings 
it may be said tentatively that even in- 
stances of similar concepts already attained 
by the learners should not be interspersed 
with instances of a concept not yet learned 
by the students. 

The suggestion (Kurtz & Hovland, 1956)
that subjects who randomly test hypotheses 
would be less affected by the intermixing of 
instances than subjects who systematically 
test hypotheses was only partially sup- 
ported, since (a) only one of the five meas- 
ures of strategy revealed the suggested in- 
teraction, and (b) the graphing of that 
interaction seemed to imply that hypothesis 
spewers acquired concepts less rapidly in 
the blocked than in the mixed presentation
condition. Both of the above issues may
have been the results of the relatively un- 
refined development of the measures. How- 
ever, on further investigation of the second 
issue, the authors found too few cases (only 
6 of the 60 were over three points above the 
regression lines’ intersection point) to judge 
the difference between blocked and mixed 
conditions for hypothesis spewers a reliable 
finding. With regard to the first issue raised above, it should be
noted that the Many Good Uses Test, the measure that did support the
prediction, was the only strategy measure that was similar to the
experimental task in stimulus presentation method. Both the
experimental task and the Many Good Uses Test required responses to
experimenter-presented stimuli (a reception paradigm) while the
three measures based on the selection concept-learning task involved
the learner in the stimulus presentation process (a selection
paradigm). In summary, the present study provided weak support for
Kurtz and Hovland’s (1956) suggestion; further exploration with
learner strategy measures utilizing the reception paradigm seems
warranted.

In conclusion, one may say that inter- 
mixing concept instances—whether the in- 
stances are of a previously attained similar 
concept or of a yet unacquired concept— 
decreases the rate of acquisition of new con- 
cepts. This decrease in rate most likely re- 
sults from memory interference caused by 
the instances of other similar concepts. It is 
also conceivable that another factor, the 
learner’s reliance on memory, may affect 
the extent to which the intermixing of con- 
cept instances results in a slower rate of con- 
cept acquisition. That is, if a learner ran- 
domly chooses hypotheses to test, in a man- 
ner relatively unaffected by memory of past 
instances, the effect of interference in mem- 
ory would be decreased. This latter factor, 
however, has as yet received only weak sup- 
port. 


IMAGERY AND PROSE LEARNING


RICHARD C. ANDERSON AND RAYMOND W. KULHAVY


University of Illinois 


High school seniors instructed to form mental images while reading a
2,000-word textbooklike passage learned no more than those merely 
asked to read carefully. On a postexperiment questionnaire more 
than half of the control group reported using imagery while about one- 
third of the group instructed to use imagery said they did not. Per- 
formance on the posttest was an increasing function of the amount of 
time during which imagery was reportedly used. 


Image-evoking value is the most impor- 
tant determiner of the learnability of words 
(Paivio, 1969). Instructions to form mental 
images strongly facilitate paired-associate 
learning (Bower, in press; Paivio & Yuille, 
1967). People asked to create mental im- 
ages of the events described in sentences 
learn two to three times as much as people 
who read the sentences aloud again and 
again (Anderson, 1971; Anderson & Hidde, 
1971). Because of the practical implica- 
tions, it is worth discovering whether im- 
agery instructions similarly facilitate learn- 
ing from prose passages of the type found in 
textbooks. This was the purpose of the pres- 
ent study. 


METHOD 
Subjects and Materials 


The subjects were 62 seniors from a small town 
high school. The prose passage was the 2,190-word 
description of a fictitious primitive tribe employed 
by Anderson & Myrow (1971). The criterion 
measure contained 34 short-answer items and 34 
four-alternative multiple-choice items with stems 
in one-to-one correspondence to the short-answer 
items. The information tested varied from specific 
(names and quantities) to general (the characteristics
of the family system). The short-answer
items were graded on the basis of prespecified 
scoring rules which permitted variations in syntax 
and word choice as long as the substance was cor- 
rect. The multiple choice test was corrected for 
guessing using the formula, R - W/3. There was
a nine-item postexperiment questionnaire which
asked about the subject’s study strategy, careful- 
ness, and interest in the materials. 
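
The correction for guessing applied to the multiple-choice test, R - W/3, is the usual formula R - W/(k - 1) with k = 4 alternatives. The sketch below is a small illustration with made-up scores; the function name and the convention that omitted items count as neither right nor wrong are assumptions, since the text does not say how omissions were handled.

```python
def corrected_score(right, wrong, n_alternatives=4):
    """Correction for guessing: R - W / (k - 1).

    Omitted items are assumed to be excluded from both counts.
    """
    return right - wrong / (n_alternatives - 1)


# Hypothetical student: 25 right and 9 wrong on the 34 items.
print(corrected_score(25, 9))
```

Under blind guessing on four-alternative items, the expected numbers right and wrong stand in a 1:3 ratio, so the corrected score has an expected value of zero.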


Procedure 


The imagery group was instructed to form a
vivid mental picture of everything described in the
booklet whereas the control group was asked only 
to read the booklet carefully. Subjects were fore- 
warned of the test. The study was run in one large 
group session. Subjects were assigned to groups by 
issuing booklets, identical except for the first page 
of instructions, which had been stacked in a
random order. When a subject finished reading 
the passage, his study time was recorded, he was 
given the short-answer test, and then he was 
given the multiple-choice test. Finally, the sub- 
ject completed the questionnaire. 


RESULTS AND DISCUSSION


Analysis of posttest variance failed to 
show a significant effect for instructions (F 
= 1.79, df = 1/60), test mode (F = 2.41, df
= 1/60), or the interaction of these factors
(F = 3.33, df = 1/60). Nor were there differences
between groups in study time (t = [...]).

The questionnaire asked whether or not 
the student tried to use “mental pictures” to 
learn. If the answer was yes, a subsequent 
question asked whether he had done so just 
at the beginning, for half the booklet, or for 
the entire booklet. Four subjects failed to 
answer one or both of these questions. Responses of the remaining
subjects indicated that more than half of the control group employed
imagery while studying the passage. About one-third of the group
that received imagery instructions reported not using imagery or
doing so only at the beginning of the passage. Figure 1 shows the
mean percentage correct on the posttest as a function of
instructions and the reported consistency with which imagery was
employed. An unweighted means analysis of variance indicated that
performance was related to the imagery reports (F = 8.27, df = 2/52,
p < .01; ω² = .18). Also significant was the Imagery Report ×
Instructions interaction (F = 3.30, df = 2/52, p < .05; ω² = .01)
which appears to be due to the very poor performance of students who
reported they did not use imagery even though they were given
imagery instructions.

[Figure 1: percentage correct (y-axis) by amount of passage for
which imagery was reported (x-axis: none or almost none, half, all),
with separate lines for image instructions and control instructions.]

FIG. 1. Percentage correct on the posttest as a function of the
amount of the passage for which the use of imagery was reported.

From this evidence, it appears that a per- 
son will learn more from a prose passage if 
he forms images of the things and events 
described in the passage. However, it also 
appears that for a passage of as many as a 
couple thousand words in length, people's tendency to employ
imagery is inadequately controlled by simple preliminary
instructions. Future research might concern itself with procedures
to evoke and maintain “imaging.”


EFFECTS OF PASSAGE ORGANIZATION AND NOTE TAKING ON
THE SELECTION OF CLUSTERING STRATEGIES AND ON
RECALL OF TEXTUAL MATERIALS


CHARLES B. SCHULTZ AND FRANCIS J. DI VESTA
Pennsylvania State University


The 48 subjects were given three study-recall trials to learn one of 
three passages in which statements were organized by concept name, 
concept attribute, or in a random order. Half of the subjects in each 
group were permitted to take notes during the study periods while the 
remainder read the material without recourse to note taking. The two 
organized passages, when compared with the randomly ordered pas- 
sage, resulted in significantly more recall and influenced the selection 
of clustering strategies congruent with the organization of the pas- 
sage. High concept-name clustering scores were consistently obtained 
on all three trials when the passage was organized by concept name. 
High concept-attribute scores were obtained when it was organized by 
concept attribute only on the third trial. The results implied that the 
latter strategy was adopted gradually. The performance of subjects 
who studied the concept-attribute passage without recourse to notes 
was impaired on the first trial compared to the performance of
subjects who studied the concept-name passage.


When given the task of learning randomly 
ordered lists of words, subjects tend to adopt 
a clustering strategy during learning and re- 
call in which the words are subjectively or- 
ganized into experimentally defined cate- 
gories (Bousfield, 1953) if they are highly 
dominant, or into idiosyncratic categories 
(Seibel, 1964) if they are not. The importance
of this finding is that subjective organization
facilitates memory of learned materials.

Frase (1969) extended the investigation 
of organizational strategies in free recall to 
passages comprised of simple sentences. 
Each sentence in the passage expressed an 
association between a concept name and a 
value of a concept attribute. The following 
sentence is an illustration drawn from one 
of Frase's experimental passages on playing 
chess: “The pawn (concept name) moves in a
forward direction (concept attribute).” As in
the subjective organization of a single list 
of words, typically used in the free-recall 
paradigm, the learner of connected dis- 
course has the option of using different 
clustering strategies. Accordingly, he can 
group by name, that is, he can group statements about all the
attributes of the same concept together, or he can group by
attribute, that is, he can group all statements about the same
attribute for each of the concept names. Moreover, a given passage
can be experimentally arranged in at least three ways: sentences can
be grouped by name (concept name), sentences can be grouped by
attribute (concept attribute), or they can be sequenced in a random
order.

The contiguity of items belonging to the 
same category appears to affect both recall 
and clustering. When the presentation of a 
word list was blocked, so that all the words 
in the same category appeared consecu- 
tively, clustering in free recall was more 
frequent than when the words were mixed 
(Cofer, Bruce, & Reicher, 1966). Moreover, 
recall of the blocked list was more efficient 
than that of the mixed list when presentation
was rapid (i.e., at 1-second intervals).
In regard to connected discourse, Frase 
(1969) found that recall of statements in 
passages blocked or organized by either 
name or attribute was superior to recall of a 
randomly ordered passage. However, the 
two organized passages were not equally 
effective in their influence on the amount 
of clustering or the subject’s choice of a 
clustering strategy; the passage organized 
by names resulted in more clustering than 
did the one organized by attribute. The 
tendency to employ the name clustering 
strategy appeared dominant, since subjects 
who read the attribute and random passages
showed more name clustering in recall than was
present in the passage they read.

The present experiment extended the generalizability of earlier
studies (Cofer et al., 1966; Frase, 1969) by examining clustering
and recall in a passage that is more closely 
analogous to materials used in instructional 
settings. Accordingly, typical social science 
content was selected as the topic of the 
passage. In addition, the sentences con- 
tained modifiers and parenthetical phrases 
and the order of the concept name and con- 
cept attribute elements within the sentence 
was varied. It was expected that these 
changes would not alter the organizational 
effects of the passage on recall and cluster- 
ing. 

A primary purpose of this experiment was 
to investigate the conditions under which 
the subject’s clustering deviates from the 
passage organization. In Frase’s (1969) 
experiment, the subjects were given the op- 
portunity to take notes while they studied 
the passage. However, we reasoned that 
note taking during the study period may
have the effect of influencing the learner to
change his clustering strategy from the or- 
ganization implicit in the passage to one 
of his own choosing. Thus, there would be 
more variation in clustering strategies 
among subjects who take notes than among 
those who learn without notes, since notes 
may provide a means of “external storage” 
and a device to rearrange the organizational 
pattern of the passage in stereotypic fash- 
ion. Without a device for external storage 
and its potentiality for normative or even 
idiosyncratic transformation, the learner 
must rely more heavily upon the original 
organizational pattern in the passage. 

It was also reasoned that when the learner 
relies on a passage organization consistent 
with his dominant clustering strategy (e.g. 
organization by name) during learning he 
would employ that organizational mode 
during recall with the effect of facilitating 
what is remembered. However, when a sub- 
ject is forced to rely on a form of passage 
organization during learning which is incon- 
sistent with his dominant strategy (e.g., at- 
tribute organization), as he must when he 
does not take notes, then he must relinquish 
the strategy he normally employs and adopt 
a different less-preferred technique. As a 
consequence of employing a subordinate and 
less-practiced strategy, recall would suffer. 

The foregoing rationale suggests the fol-
lowing hypotheses: (a) The effect of pas-
sage organization on the learner's clustering 
in recall is greater without notes than it is 
with notes; (b) When the learner must de- 
pend upon the passage organization, the 
adoption of the dominant (Frase, 1969) 
concept-name strategy to organize the ma- 
terial is relatively spontaneous, while adop- 
tion of the less-preferred concept-attribute 
strategy to organize the material is gradual; 
(c) Acquisition during the early stages of 
learning a passage organized by attribute 
is impaired relative to acquisition during the 
early stages of learning a passage organized 
by name. 


METHOD 
Design 


High school juniors and seniors were given three 
brief study periods to learn a passage which de-
scribed six imaginary nations. The study periods
were each followed by a writing period during 
which a free-recall test was administered. Measures 
of both recall and clustering of responses were ob- 
tained from the free-recall test. Three levels of 
passage organization (concept name, concept attri- 
bute, and random sentence sequence) were orthog- 
onally crossed with two note-taking conditions 
(note taking and reading only). These manipula-
tions imply a 2 X 3 X 3 factorial analysis of var-
iance with repeated measures (trials) on the last
factor. 


Subjects 


The subjects were eleventh- and twelfth-grade 
students from a local high school, who were in the 
upper 20% of their class in academic rank. They 
were randomly assigned to six experimental con- 
ditions. None had participated previously in any 
experiment. 


Stimulus Materials 


The basic experimental passage was constructed 
according to procedures described by Frase (1969). 
The passage consisted of statements about six 
imaginary nations such as Brontus, Bismania, and 
Nurovia. Six characteristics were described for
each nation (e.g., its geographic features, socio-
economic stage of development, and type of gov-
ernment) resulting in a matrix of six nations (con-
cept names) by six characteristics (concept attri- 
butes) as summarized in Table 1. Statements were 
constructed for each cell in the 6 X 6 matrix. For 
example, the following sentences were based on the 
row of attributes describing geographic features: 

Bismania is marked by an extensive system of lakes.
Most of Brontus is plain, level land.
Much of the southern part of Nurovia is desert land.
Atweena is an island nation.
Galbian is a land-locked nation.
A mountainous terrain characterizes much of Egrama.

Three different sets of materials were devel- 
oped: One was based on the organization of state- 
ments according to concept name, that is, the state- 
ments were derived from the contents of the col- 
umns of the matrix (concept name). A second was 
organized by concept attribute, that is, the state- 
ments were derived from the contents of the rows 
of the matrix (concept attribute). A third consisted 
of arranging the sentences in a random order.

The organization of the passage was evaluated 
with the same procedure used to compute cluster-
ing ratios in the free-recall protocols. The concept-
name and concept-attribute clustering ratios were 
separately computed according to a formula used 
by Frase (1969) as follows: 


\[
\text{Clustering ratio} = \frac{\text{Number of Repetitions}}{(\text{Total Number of Sentences Recalled}) - (\text{Number of Categories Recalled})} \times 100.
\]


The number of adjacent repetitions by a given
name was used to determine the concept-name
clustering ratio and the number of adjacent repe-
titions by a given attribute was used to compute
the concept-attribute clustering ratio. In the de-
nominator, the number of categories recalled was
subtracted from the total statements recalled be-
cause the first statement in each category can not
be considered a repetition. According to this pro-
cedure, the percentage of organization by name for
the concept-name passage was 100%, for the con-
cept-attribute passage, 0%, and for the random-
order passage, 7%. The percentage of organization
by attribute for the concept-name passage was 0%,
for the concept-attribute passage, 100%, and for
the random-order passage, 17%.


TABLE 1
CONCEPT ATTRIBUTE BY CONCEPT NAME MATRIX EMPLOYED IN THE
CONSTRUCTION OF THE PASSAGE

                         | Concept name
Concept attribute        | Bismania | Brontus | Nurovia | Atweena | Galbion | Egrama
Type of society          | modernized, well-developed | urban | urban | urban and manufacturing | urban and industrial | industrial
Socioeconomic conditions | national unity | social unrest | full employment | peace and prosperity | economic depression | political factions
Geography                | lakes | level, plain land | desert | island | land locked | mountainous
Government               | one party | democracy | autocratic | representative republic | totalitarian | coalition
Population growth        | increasing | decrease | increase | increase | decrease | decreasing
Death rate               | 18 per 1,000 | 15 per 1,000 | 10 per 1,000 | 12 per 1,000 | 14 per 1,000 | 11 per 1,000
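
The clustering ratio described in this section can be illustrated with a short sketch. The function name, the placeholder labels, and the treatment of a zero denominator are assumptions made for illustration and are not taken from Frase (1969) or from the present procedure. The example builds a hypothetical passage ordered by concept name, with the attributes in the same order for every nation, and should reproduce the 100% (by name) and 0% (by attribute) figures reported for the concept-name passage.

```python
from typing import Hashable, Sequence


def clustering_ratio(categories: Sequence[Hashable]) -> float:
    """Return the clustering ratio (0-100) for a sequence of category labels.

    `categories` lists, in recall (or presentation) order, the category of
    each sentence: nation names for the concept-name ratio, attribute labels
    for the concept-attribute ratio.
    """
    total = len(categories)
    n_categories = len(set(categories))
    denominator = total - n_categories
    if denominator <= 0:
        # Assumption (not specified in the text): when no repetition is
        # possible, the ratio is reported as 0.
        return 0.0
    # A repetition is a sentence whose category matches the one just before it.
    repetitions = sum(
        1 for prev, cur in zip(categories, categories[1:]) if prev == cur
    )
    return 100.0 * repetitions / denominator


# Hypothetical 6 x 6 passage: labels stand in for the nations and attributes.
names = [f"nation{i}" for i in range(6)]
attributes = [f"attribute{j}" for j in range(6)]

# Concept-name passage: all six statements about one nation, then the next.
name_passage = [(n, a) for n in names for a in attributes]

name_ratio = clustering_ratio([n for n, _ in name_passage])
attribute_ratio = clustering_ratio([a for _, a in name_passage])
print(name_ratio, attribute_ratio)
```

The same function applies to a free-recall protocol once each recalled statement has been coded by nation and by attribute.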


Procedure 


Upon entering the experimental room, two sub- 
jects were seated at separate desks. The subjects 
were told that they were about to participate in a 
learning experiment in which they were to study a 
passage containing descriptions of a number of 
imaginary nations. Their task was to remember as 
many of the statements from the passage as possi- 
ble. They were further told that they would be 
given three 5-minute study periods each of which 
would be followed by a 6-minute writing period. 

Organization treatments. One experimental di- 
mension consisted of three levels of passage orga- 
nization as described in the Materials section. Any 
one subject was administered only one passage, 
either concept name, concept attribute, or random 
order. 

Note-taking treatments. The passage organiza- 
tion conditions were orthogonally crossed with two 
note-taking treatments. Half of the subjects as- 
signed to study each of the passages in the three 
organizational treatments were given instructions 
to the effect that they could take notes (i.e., the
note-taking treatment) to help them remember 
the passage during the study period. They were 
told the notes would not be available to them 
during the writing period. The remaining subjects 
(i.e., those in the reading-only treatment) were not
offered an opportunity to take notes. No mention 
of notes was made in the instructions to these 
groups, nor were they provided with paper on 
which notes might have been recorded. 

Scoring. The free-recall protocols were scored 
for the number of statements correctly recalled, 
the number of errors, and clustering ratios for both 
concept name and concept attribute. In order to 
be counted as a correct statement, the concept 
name (or an approximation of the correct name) 
must have been associated with the correct value 
of a concept attribute. A further constraint in pro- 
cedure applied to the scoring of compound attri- 
bute statements such as, “Bismania enjoys peace 
and prosperity.” In this case, either peace or pros-
perity was required if the statement was to be
scored as correct. In the case of a single attribute
statement such as, “Brontus is an urban society,” 
the answer may have included an incorrect attri- 
bute value as long as the correct attribute value 
was also included. Thus, the answer, “Brontus is an 
industrial and urban society,” was scored as cor- 
rect. Clustering ratios were computed according to 
the formula employed by Frase (1969) and de- 
scribed previously in the section entitled Stimulus 
Materials. Correct, incorrect, and incomplete
responses were all included in the clustering ratios.


RESULTS 


Measures of clustering, statements cor- 
rectly recalled, and errors were obtained 
for each of the three free-recall trials. Scores 
obtained from these measures were analyzed 
via a 2 X 3 X 3 analysis of variance with
repeated measures (trials) on the last fac- 
tor. The results of these analyses are pre- 
sented separately below. 


Clustering 


Separate analyses of variance were made 
of the amount of clustering during recall 
based on names and on attributes. The 
analyses based on the concept name ratio 
yielded F = 3.59, df = 2/42, p < .05 for the
effect due to passage organization. Mean
clustering scores for the concept name, con-
cept attribute, and random order groups
were X = 71.44, 42.71, and 53.27, respec-
tively. Newman-Keuls multiple-comparison
procedures were used to test differences 
among the means obtained from this anal- 
ysis as well as from others to be described 
below. According to this test, the concept 
name clustering ratio for the concept name 
group was greater than that for the concept 
attribute group (p < .05). A similar analy- 
sis of variance of the concept attribute 
clustering ratio yielded F = 5.54, df = 2/42,
p < .01 for the effect due to passage organi-
zation. The mean concept attribute scores
for the three passage organizations were 
X = 20.00 for the concept name group, 
53.31 for the concept attribute group, and 
29.62 for the random order group. Concept- 
attribute clustering by the concept attri- 
bute group was greater than that of both 
the concept name (p < .01) and the random 
order groups (p < .05). 

The direction of the means in both anal- 
yses implied a negative relationship be- 
tween concept attribute and concept name 
clustering. When an individual adopts and 
uses one strategy, his clustering ratio on 
the other strategy is minimal. The overall 
correlation between the use of concept name 
and concept attribute strategies was r = 
−.84, df = 46, p < .01. Correlations between
the obtained concept name and concept at-
tribute clustering ratios for individual sub-
jects within each (df = 15) of the three
passage organizations were r = −.95 for
the concept attribute group, −.91 for the
concept name group, and −.32 for the ran-
dom order group. The large negative corre-
lations indicate that subjects emphasized
one clustering strategy to the exclusion of 
the other. The relatively low negative corre- 
lation obtained for subjects in the random 
order group as well as their low clustering 
scores on both concept attribute and concept 
name measures suggest that their recall was 
not as systematically organized as either 
of the other groups, thereby reflecting the 
lack of organization in the passage. 

Although each of the textual organiza- 
tions appears to have influenced the selec- 
tion of one or another strategy, the concept 
name and concept attribute strategies were 
not used to the same extent. In order to de- 
termine the dominance of one strategy, a 
t test of the difference between correlated 
means of concept name and concept attri- 
bute scores was made (McNemar, 1969, pp. 
113-114). This analysis yielded t = 3.01,
df = 47, p < .01, implying that concept 
name strategies were preferred to concept 
attribute strategies. 


TABLE 2
MEAN CONCEPT ATTRIBUTE AND CONCEPT NAME
SCORES FOR EACH OF THE NOTE-TAKING
AND PASSAGE-ORGANIZATION
CONDITIONS OVER TRIALS

Note-taking condition | Passage | Trial 1 | Trial 2 | Trial 3 | Total

Concept attribute scores
Note taking  | R  | 26.62 | 19.50 | 19.00 | 21.71
Note taking  | CN | 31.88 | 29.25 | 27.88 | 29.67
Note taking  | CA | 42.12 | 41.25 | 50.38 | 44.58
Reading only | R  | 38.00 | 38.00 | 36.62 | 37.
Reading only | CN |  6.25 | 12.88 | 11.88 | 10.
Reading only | CA | 59.00 | 52.12 | 75.00 | 62.04

Concept name scores
Note taking  | R  | 74.62 | 45.75 | 60.12 | 60.17
Note taking  | CN | 45.88 | 68.00 | 53.38 | 55.75
Note taking  | CA | 51.12 | 50.50 | 42.00 | 47.88
Reading only | R  | 60.38 | 41.12 | 37.62 | 46.38
Reading only | CN | 91.38 | 83.75 | 86.25 | 87.12
Reading only | CA | 41.38 | 48.38 | 22.88 | 37.54

Note.—Abbreviations: R = random order;
CN = concept name; CA = concept attribute.


The impact of the concept name and con-
cept attribute passage organization on the 
clustering strategies of subjects who took 
notes and on those who did not was ex- 
amined in a separate analysis. The means 
for these groups are summarized in Table 2. 
Taking notes while studying appears to 
minimize the influence of passage organi- 
zation on strategy selection. Both the use 
of concept name clustering by the concept 
name group and of concept attribute clus- 
tering by the concept attribute group are 
influenced more by passage organization 
when students must rely mainly on memory 
than when notes were permitted. A compari- 
son of the mean concept name scores for 
the concept name subjects who took notes 
(X = 55.75) and those who did not take 
notes (X = 87.12), yielded t = 2.04, df = 
42, p < .05. A similar comparison of the 
concept attribute clustering scores for the 
concept attribute subjects who took notes 
(X = 44.58) and those who did not (X = 62.04)
was in the expected direction; however, the
difference in this case was not reliable (t = 
1.20, df = 42, p > .05). 

It was expected that the concept name 
strategy would be immediately adopted by 
subjects who studied the concept name pas- 
sage, particularly when they were not per- 
mitted to take notes. Adoption of the con- 
cept attribute strategy by subjects studying 
that passage was expected to be gradual. In 
order to determine whether the effect of the 
Passage Organization x Trials interaction 
implied by this hypothesis was obtained, a 
clustering score was required which would 
reflect both concept name and concept at-
tribute clustering factors. Accordingly, a
combined clustering score (CCS) was com-
puted for each subject as follows: concept
name ratio score − concept attribute ratio
score + 100. This procedure resulted in a
range of scores from 0 to 200. The upper 
limits of the range indicate complete con- 
cept name clustering; the lower extremes 
suggest complete concept attribute cluster- 
ing. Scores approximating the midpoint of 
the range imply that neither strategy was 
consistently adopted. 
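
As a minimal illustration of the combined clustering score defined above, the sketch below computes it from a pair of ratio scores. The function name and the three example inputs are assumptions chosen for illustration (the two extremes and the midpoint of the range); they are not data from the experiment.

```python
def combined_clustering_score(name_ratio: float, attribute_ratio: float) -> float:
    """CCS = concept name ratio - concept attribute ratio + 100 (range 0 to 200)."""
    return name_ratio - attribute_ratio + 100.0


# Illustrative inputs only: each ratio is a percentage between 0 and 100.
complete_name = combined_clustering_score(100.0, 0.0)       # upper limit of the range
complete_attribute = combined_clustering_score(0.0, 100.0)  # lower limit of the range
no_preference = combined_clustering_score(50.0, 50.0)       # midpoint of the range
print(complete_name, complete_attribute, no_preference)
```

Because the score is a linear combination of the two ratios, the mean CCS of a group equals the CCS computed from that group's mean ratios.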

Fig. 1. ... organized by concept name (CN) and by concept attribute (CA) in the note-taking and reading-only conditions, depicting the Passage Organization X Trials interaction.

The combined clustering scores across
trials for subjects in the concept attribute
and concept name groups are displayed in
Figure 1. In the reading-only condition,
subjects reading the concept name passages
achieved high combined clustering scores in
the early trials. On the other hand, no strat-
egy preference was reflected in the combined
clustering scores for subjects reading the
concept attribute passages during the first
two trials; the concept attribute strategy
was not adopted until the third trial. A test
of the interaction implied by these means
was made by comparing combined cluster-
ing scores for the concept name and concept
attribute groups for the second and third
trials, following Games (1971).



This analysis yielded t = 2.02, df = 84, p 
< .05. Because of the sharp drop in com- 
bined clustering scores for the concept at- 
tribute group between Trials 2 and 3, a 
further analysis was conducted by compar- 
ing the mean concept attribute clustering 
scores across trials for the concept attribute 
group without notes. The difference in clus- 
tering between Trial 1 (X = 59.00) and 
Trial 2 (X = 52.12) was not significant (p
> .05). However, the analysis of the dif-
ference between Trials 2 and 3 (X = 75.00)
yielded t = 1.93, df = 84, p = .05 (for a
two-tailed test with df = 80, t.05 = 1.99).
The difference between the means suggests 
an early adaptation phase where relatively 
little consistent clustering occurs, followed 
by a relatively complete adoption of the 
concept attribute strategy. 


Recall 


An analysis of the effect of passage or-
ganization on recall of correct statements
yielded F = 6.86, df = 2/42, p < .005. The
mean numbers of statements recalled for the
three groups were as follows: X = 10.92
for the concept name group, 10.73 for the
concept attribute group, and 6.98 for the
random order group. Comparisons among
these means indicated recall was signifi-
cantly greater in the concept name and
concept attribute groups than it was in the
random order group (p < .01), but the con-
cept name and concept attribute groups
did not differ significantly from each other
(p > .05). Although there were no signifi-
cant effects due to note taking (F < 1.00),
the analysis yielded F = 74.08, df = 2/84,
p < .001 for the effect due to trials.


TABLE 3
MEAN STATEMENTS CORRECTLY RECALLED AND
MEAN ERRORS FOR THE CONCEPT NAME,
CONCEPT ATTRIBUTE, AND RANDOM
GROUPS(a)

Note-taking condition | Passage | Trial 1 | Trial 2 | Trial 3 | Total

Statements correctly recalled
Note taking  | R  | 4.00 |  6.38 |  9.62 |  6.67
Note taking  | CN | 6.00 | 11.25 | 15.62 | 10.96
Note taking  | CA | 6.88 | 12.25 | 16.50 | 11.88
Reading only | R  | 4.62 |  8.25 |  9.00 |  7.29
Reading only | CN | 8.00 | 10.75 | 13.88 | 10.88
Reading only | CA | 5.25 | 10.62 | 12.88 |  9.58

Errors
Note taking  | R  | 2.88 | 3.62 | 3.62 | 3.38
Note taking  | CN | 2.00 | 3.38 | 2.12 | 2.50
Note taking  | CA | 3.88 | 3.75 | 2.25 | 3.29
Reading only | R  | 2.50 | 2.12 | 3.50 | 2.71
Reading only | CN |  .75 | 1.50 | 1.25 | 1.17
Reading only | CA | 3.12 | 2.00 |  .88 | 2.00

Note.—Abbreviations: R = random order;
CN = concept name; CA = concept attribute.

(a) In the reading-only condition, the concept-
attribute group recalled fewer statements and
committed more errors on the first trial than did
the concept-name group.

The subjects who studied the concept 
attribute passage had either to reorganize the
passage into a concept name pattern or to
relinquish their presumably favored con- 
cept name clustering strategy and adopt 
one consistent with the concept attribute 
organization externally imposed on the pas- 
sage. When the change in clustering strategy 
was not facilitated by an “external device” 
such as notes, it was expected that shifting 
strategies would interfere with early at- 
tempts to recall the contents of the passage. 
The learning curves for the concept name 
and concept attribute groups are summa- 
rized in Table 3. On the first trial of the 
reading-only condition, recall by the con- 
cept attribute group was depressed relative 
to recall by the concept name group. In con- 
trast, learning curves for the concept name 
and concept attribute groups in the note- 
taking condition were parallel. However, 
the Passage Organization X Trials interac-
tion for correct statements recalled by the
reading-only group was not significant (p
> .10).

The number of statements recalled cor- 
rectly was not related to either clustering 
via the concept name strategy (r = −.03) or
the concept attribute strategy (r = −.04).
A score comprised of the largest clustering 
ratio for each trial was moderately but 
reliably related to recall (r = .33, df = 46, 
p < .05). 


Errors 


Analyses were made of the incorrect 
statements included in free recall. This 
analysis yielded F = 1.74, df = 2/42, p < 
.10 for the effect due to passage organiza-
tion and F = 4.13, df = 2/42, p < .025 for
the effect due to note taking. The latter ef- 
fect implies that more errors were made 
with notes (X = 3.06) than without notes 
(X = 1.96). 

The effect due to the interaction of Pas- 
sage Organization X Trials yielded F = 
3.84, df = 2/84, p < .01. This interaction 




EFFECTS OF PASSAGE ORGANIZATION AND NOTE TAKING 251 


was marked by the general tendency of er-
rors to increase across trials for the ran-
dom order group (X = 2.69 for Trial 1,
2.88 for Trial 2, and 3.56 for Trial 3) and
to decrease across trials for the concept at-
tribute group (X = 3.50 for Trial 1, 2.88 for
Trial 2, and 1.56 for Trial 3). Since the
learner’s preoccupation with acquiring a
new clustering strategy could result in an
increase in errors as well as lower recall
scores, a specific test of the effect of the in-
teraction between Trials (first and third) X
Passage Organization (the concept name
and concept attribute groups) was made
separately for the note-taking and reading-
only conditions. The means for these groups
are summarized in Table 3. This analysis
yielded t = 1.50, df = 84, p < .10 for the
note-taking group and t = 2.34, df = 84,
p < .05 for the reading-only condition.
Much of the interaction effect in the con-
cept attribute reading-only group appears
to be accounted for by the drop in errors
from the first trial (X = 3.12) to the third
trial (X = .88). A comparison between
these means yielded t = 2.70, df = 84,
p < .01.


DISCUSSION


It is clear that passages consisting of or- 
ganized sets of sentences resulted in more 
clustering and recall than randomly ordered 
sets of sentences. These findings are con-
sistent with results obtained by Cofer et al.
(1966) who used word lists, and by Frase 
(1969) who used simple sentences. Thus, 
recall was facilitated when input was or- 
dered in a manner where words or sentences 
were conceptually parallel in the sense that 
sentences dealt with the same category.

A more interesting finding, however, was 
that each of the passage organizations in- 
fluenced the selection of a clustering strat- 
egy by subjects who studied them. In the 
case of the passages organized by name or
attribute, subjects identified organizational
cues from the passage and incorporated
these cues in the acquisition or selection of
a clustering strategy. In the case of the pas-
sage in which sentences were randomly or-
dered, the organizational cues were not
immediately apparent. Accordingly, the
subject must develop his own strategy, per-
haps, as Frase has suggested, at the expense
of learning the statements. 

The present findings also led us to infer
that regardless of the passage organization,
the strategy of clustering by names was
more dominant than the strategy of cluster-
ing by attributes. The subjective organiza-
tion of materials by concept name may
have been favored because it required the 
least amount of change from sentence to 
sentence and permitted relatively direct 
classification of information. Since the nam- 
ing or labeling element of each sentence 
within a passage externally organized by 
concept name remains the same within a 
given paragraph, only the value of the at- 
tribute changes. By using the concept name 
clustering strategy, the subject’s task be- 
comes comparable to learning six short se- 
rial-learning lists, each of which is com- 
prised of a set of concept attribute values 
associated with a particular concept name 
as represented in the six columns of Table 
1. In this regard it is interesting to note that 
subjects who used the concept name strat- 
egy often substituted ditto marks or the 
pronoun, it, for the concept name, suggest- 
ing a process resembling serial learning. 

In the passage externally organized by 
concept attribute, both the concept name 
and concept attribute elements differed from 
sentence to sentence, thereby requiring the 
acquisition of separate associations by the 
learner for each sentence. Thus, the concept 
attribute passage resembles a paired-asso- 
ciate task in which the same set of stimulus 
terms is paired with different response terms 
in each paragraph. As a result, the concept 
name clustering strategy may have been 
preferred because it was more efficient; in 
a sense, it required fewer associations for 
learning the same passage than did the con- 
cept attribute clustering strategy. 

The concept name clustering strategy 
may also have been preferred because it 
tended to be more frequently employed in 
written materials. Thus, we are saying that 
experience in the culture may favor the 
use of the concept name strategy. Neverthe- 
less, whatever the reason for its selection, 
the concept name clustering strategy ap-
pears to have been adopted by subjects
studying the concept name passage with- 
out notes, and maintained at a high level 
throughout their efforts to learn the passage. 
In fact, the concept name group organized 
so completely via the concept name strat- 
egy that there was very little use of the 
concept attribute clustering strategy. This 
finding suggested that little interference 
between the two clustering strategies oc-
curred for the concept name group. In con-
trast, the concept attribute strategy was
adopted by subjects in the concept attribute
group only during the third trial; that is, only
after some experience with the passage.
During the earlier trials, the concept attri- 
bute clustering strategy was matched by an 
almost equally high use of the concept name 
clustering strategy. Thus, the concept name 
strategy appeared to compete with the adop- 
tion of the concept attribute strategy for 
the concept attribute group. 

These findings suggested that the sub- 
ject had a dominant clustering strategy 
which was gradually relinquished when he 
found it was not as appropriate as a subor- 
dinate strategy for learning a particular pas- 
sage. He may be depicted, for illustrative 
purposes, as being in a trade-off situation 
in which he must weigh the problem of 
abandoning his preferred clustering strategy 
against the difficulty of reorganizing the 
passage. Apparently when note taking was 
not permitted, the task of reorganizing the 
passage to fit the subject’s preferred strat- 
egy increased memory strain. He opted to 
relinquish the dominant strategy in favor 
of a subordinate clustering strategy more 
consistent with the organization of the pas- 
sage as presented to him. 

These findings implied that learners test 
clustering strategies for their effectiveness 
against the task requirements (Restle, 
1962). When a strategy meets those require- 
ments, it is retained; when it fails to meet 
those requirements it is rejected in favor of 
another which is sampled from a “pool” of 
strategies associated with similar tasks.
From an instructional standpoint, the im-
portance of giving serious attention to the
way in which material is organized is all too 
apparent from these findings. 

The effect of taking notes while studying 
is important in several respects. Because 
note taking can facilitate reorganization, it 
may change the balance in the trade-off 
between abandoning the preferred cluster- 
ing strategy and reorganizing the passage, 
rendering the reorganization alternative 
more attractive. When permitted to take 
notes, subjects did not appear to reduce the 
inconsistency between the organization im- 
plicit in the passage and their preferred 
clustering strategy by abandoning or modi-
fying their strategy. Rather, they “modi-
fied” the passage to suit their own organi-
zational schemes. From an instructional
standpoint, these findings point to the im-
portance of giving serious attention to the
way material is organized when students 
are not permitted to take notes. It is ap- 
parent that further research on efficient 
modes of organization may be a potentially 
fruitful endeavor.


SOME SOCIAL-EMOTIONAL CONSEQUENCES OF EARLY
INADEQUATE ACQUISITION OF READING SKILLS 


OREN GLICK* 
University of Puget Sound 


This study investigated the relationship between early failure in 
reading and subsequent changes in (a) general and academic self- 
concepts, (b) attitudes toward school, (c) perceived parent behavior, 
and (d) classroom peer relationships. Residual change scores were cal- 
culated on "good" and "poor" readers from 10 third-grade class- 
rooms. Significant relationships between initial reading levels and 
social-emotional changes were observed. In general, poor male read- 
ers incurred negative consequences while little benefit accrued 
to good male readers. In contrast, benefits accrued to good female 
readers but negative consequences were not incurred by poor female 
readers. It was concluded that early academic performance has con-
sequences in the social-emotional domain which perpetuate and
generalize patterns of success for females and failure for males. 


Fitzsimmons, Leonard, and Macunovich 
(1969) reported an intensive retrospective 
study of 270 pupils who had serious per- 
formance difficulties in high school or who 
had dropped out of school. They found that 
50% of their sample had experienced severe 
failure as early as the second grade, that 
“reading” was the most frequent origin of 
failure, and that the predominant pattern of 
failure was one of initial failure in one or 
two areas that in subsequent years “spread” 
to other areas. By the third grade, “two- 
thirds of the students who would have de- 
veloped spread patterns in future years had 
their first failure, and these had their origins 
in English [p. 139]." 

1 The assistance of the following organizations
and persons in the conduct of this research is grate-
fully acknowledged: The Kansas City, Missouri,
Public School District, particularly John Hartely
and Carl Thompson, principals of the two schools
involved. The Institute for Community Studies
provided the necessary financial assistance. John
White and Quentin Isely of the Institute assisted
in data gathering and analysis.

Requests for reprints should be sent to Oren
Glick, School of Education, University of Puget
Sound, Tacoma, Washington 98416.


The results of the Fitzsimmons et al.
study dramatize the need for increased in-
quiry into the processes initiated by early
failure which apparently perpetuate and
generalize the failure during subsequent
years. What are the possible consequences
of early school failure that might serve to
perpetuate the same? Data available from a
previous study (Glick, 1969a) provided a
unique opportunity to investigate some of
the immediate consequences of early read-
ing failure on selected events in the social-
emotional domain.
Failure is likely to have negative conse- 
quences in such areas as the self-concept, 
attitudes towards school, peer relations, and 
family relations. Conversely, academic suc- 
cess may be expected to have positive con- 
sequences in these areas. Such consequences 
may contribute favorably or unfavorably 
to subsequent academic performance, thus 
generating an on-going system of interde- 
pendent relationships between academic 
performance and the social-emotional do-
main.
The present study consists of a series of 
comparisons between two groups of third- 
grade children, one of which performed be- 
low and the other at or above the expected 
reading norm on measures taken at the be- 
ginning of the third grade. Data of the 
following kind were obtained from these 
pupils: (a) general self-concept; (b) aca-
demic self-concept; (c) attitudes toward
teachers; (d) attitudes toward school work; 
(e) attitudes toward peers; (f) attitudes 
toward school in general; (g) 15 scales of 
perceived parental behavior; and (h) four 
attributes of classroom peer relations. All 
data were gathered at the beginning and 
at the end of the third grade, thus pro- 
viding a measure of change in the course 
of the third grade. The question is: Did 
pupils who were manifesting difficulty in 
reading at the beginning of the third grade 
change differently in the measured areas 
from those whose reading was satisfactory? 


METHOD


The data were obtained from 10 third- 
grade classrooms in two schools of a Mid- 
western metropolitan public school system. 
Both schools served similar all-white mid- 
dle-lower middle socioeconomic commu- 
nities. Approximately 140 male and 130 fe- 
male pupils were involved though this 
number varies among the variables in the 
analyses due to the requirement that each 
pupil provide satisfactory data at both the 
beginning and end of the third grade. 


Data Gathering Instruments 


Reading assessment. Pupils who performed at or 
above the 3.0 grade equivalent score on one or more 
of the three portions of the Metropolitan Achieve- 
ment Test in reading were classed as “good” read- 
ers. Those who performed below the 3.0 grade 
equivalent score on all three portions of the test 
were classed as “poor” readers. 

Self-concept. The self-concept was measured by 
means of an adaptation of the Brookover, Erick- 
son, and Joiner (1967) Self-Concept of Ability 
developed at the secondary-school level. The in- 
strument was modified for use at the primary level 
and was extended to include assessment of general 
attributes (10 items) as well as academic ability 
(6 items). 

School attitudes. Pupil attitudes toward school 
were assessed by means of the Pupil Opinion 
Questionnaire developed in the Kansas City Youth 
Development Project (Glick, 1967). This was a 60- 
item Likert-type scale that measured attitudes to- 
ward teachers, school work, peers, and school in 
general. Examples of items from each component
follow.


Teachers:
1. Teachers really do not understand children.
2. Teachers expect too much of pupils.
School work:
1. Pupils have to keep reading and studying the 
same things over and over in school. 
2. My daily school work is full of things that 
keep me interested. 
Peers: 
1. Most of the pupils in my class are friendly to- 
wards each other. 
2. It is hard to make friends in school. 
School in General:
1. Most pupils would be better off if they never 
went to school. 
2. Most things about school are all right. 


A 5-point response scale was used ranging from 
strongly agree to strongly disagree. 

Parental behavior. Perceptions of mother’s be- 
havior were assessed by means of a modification of 
Schaefer’s (1965) Child’s Report of Parental Behavior.
The original instrument was reduced both
in terms of numbers of dimensions and items. The 
following 15 dimensions were scored: acceptance, 
child centeredness, possessiveness, rejection, con- 
trol and enforcement, positive involvement, in- 
trusiveness, control through guilt, hostile control, 
inconsistent discipline, nonenforcement, lax discipline
and extreme autonomy, acceptance of individuation,
instilling persistent anxiety, hostile detachment,
and withdrawal of relations.

Peer relationships. A modification of the Syra- 
cuse Scales of Social Relations (de Jung & Haring, 
1962) was used to assess pupils’ interpersonal rela- 
tionships within the classroom. Each pupil was 
scored in terms of the average rating he made of 
other same- and opposite-sexed pupils and the av- 
erage rating he received from other same- and op- 
posite-sexed pupils on a need succorance criterion. 

Analyses. On each variable, posttest scores were 
regressed on pretest scores. From the regression 
equation, posttest scores were predicted for each 
pupil. Each pupil was classified according to 
whether his observed posttest score was higher 
than (favorable and unfavorable change for positive
and negative content scales, respectively) or
lower than (unfavorable and favorable change for 
positive and negative content scales, respectively) 
his predicted posttest score. For each variable, and 
separately for male and female pupils, chi-squares
were computed on the frequencies of favorable and 
unfavorable change for pupils scoring below (poor 
readers) and above (good readers) the reading 
norm at the beginning of the year. 
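
The residual-change procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's program. All function names are hypothetical; fitting one regression over whatever set of pupils is passed in, and counting exact ties toward neither side, are assumptions. The one-sample form of the chi-square (an even split of favorable and unfavorable changers as the expected frequencies) is inferred from Table 1, where 56 favorable and 29 unfavorable changers yield the tabled value of 8.58.

```python
from statistics import mean


def fit_line(pre, post):
    """Least-squares slope and intercept for posttest regressed on pretest."""
    mean_pre, mean_post = mean(pre), mean(post)
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    return slope, mean_post - slope * mean_pre


def count_changes(pre, post, slope, intercept, positive_scale=True):
    """Count pupils whose observed posttest falls on the favorable (+) or
    unfavorable (-) side of the score predicted from the pretest."""
    plus = minus = 0
    for x, y in zip(pre, post):
        residual = y - (intercept + slope * x)
        if residual == 0:
            continue  # assumption: exact ties count toward neither side
        # On a positive-content scale a higher-than-predicted score is
        # favorable; on a negative-content scale the direction is reversed.
        if (residual > 0) == positive_scale:
            plus += 1
        else:
            minus += 1
    return plus, minus


def chi_square_even_split(plus, minus):
    """One-sample chi-square (1 df) against equal expected frequencies."""
    return (plus - minus) ** 2 / (plus + minus)


print(round(chi_square_even_split(56, 29), 2))  # 8.58
```

The last line reproduces the general self-concept entry for good male readers; the same function applied to 18 and 30 gives 3.00, the entry for poor male readers.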


RESULTS 


Table 1 presents the frequencies of favorable
and unfavorable residual change scores
and results of the chi-square analyses.


TABLE 1

FREQUENCY ANALYSES OF FAVORABLE (+) AND UNFAVORABLE (-) SOCIAL-EMOTIONAL
CHANGES FOR GOOD AND POOR READERS BY SEX

                                        Males                                 Females
                                        Good               Poor               Good               Poor
Social-emotional dimension                +   -  χ²          +   -  χ²          +   -  χ²          +   -  χ²
Self-concept
  General                                56  29  8.58***    18  30  3.00       44  58  1.92       14  11  .36
  Academic                               53  32  5.19*      20  26  .78        47  55  .63        15  10  1.00
Pupil attitude
  Teacher                                43  42  .01        13  34  9.38***    62  40  4.75*      15  10  1.00
  General                                42  43  .01        17  30  3.60       71  31  15.69****  13  12  .04
  School work                            46  39  .58        24  23  .02        61  41  3.92*      13  12  .04
  Peers                                  43  42  .01        16  31  4.79*      60  42  3.18       12  13  .04
Child Report of Parental Behavior
  Acceptance                             43  31  1.95       19  21  .10        59  29  10.23***   10  10  0.00
  Child centeredness                     35  39  .22        18  22  .40        48  40  .73        12   8  .80
  Possessiveness                         41  33  .86        19  21  .10        49  39  1.14       10  10  .00
  Rejection                              38  36  .05        20  21  .02        54  34  4.55*      13   7  1.80
  Control and enforcement                42  32  1.35       23  17  .90        43  45  .05        11   9  .20
  Positive involvement                   39  35  .22        21  19  .10        47  41  .41         9  11  .20
  Intrusiveness                          37  37  0.00       18  22  .40        45  43  .05        10  10  0.00
  Control through guilt                  35  39  .22        14  26  3.60       45  43  .05         5  15  5.00*
  Hostile control                        38  35  .12        12  29  7.05**     48  40  .73        11   9  .20
  Inconsistent discipline                38  36  .05        19  20  .03        57  31  7.68***    13   7  1.80
  Nonenforcement                         34  40  .49        19  21  .10        58  30  8.91***    10  10  0.00
  [label illegible]                      37  37  0.00       23  17  .90        47  41  .41         7  13  1.80
  Instilling anxiety                     36  38  .05        13  27  4.90*      52  36  2.91       13   7  1.80
  Hostile detachment                     39  35  .22        16  24  1.60       59  29  10.23***   11   9  .20
  Withdrawal                             37  37  0.00       15  24  2.08       56  32  6.55*      12   8  .80
Syracuse Scale of Social Relations
  Mean rating made (opposite sex)        46  42  .18        34  17  5.67**     47  55  .63        14  13  .04
  Mean rating received (opposite sex)    41  47  .41        27  24  .18        51  51  0.00       10  17  1.81
  Mean rating made (same sex)            38  50  1.64       22  29  .96        58  44  1.92       16  11  .93
  Mean rating received (same sex)        45  43  .05        14  37  10.37***   72  30  17.29****  10  [illegible]
Reading achievement                      43  47  .18        13  13  0.00       [illegible]        [illegible]

* p < .05.
** p < .02.
*** p < .01.
**** p < .001.

Note. The numbers in the plus and minus columns specify the number of pupils showing favorable
and unfavorable change, respectively, on the measured variable.


Self-Concept


Good male readers were more likely to
have favorable than unfavorable changes in
both general (χ² = 8.58, p < .01) and academic
(χ² = 5.19, p < .05) self-concept.
[illegible] significantly from good male readers on either
self-concept measure. No significant differences
were noted for females.


Pupil Attitudes


In attitudes toward teachers and peers,
poor male readers showed unfavorable change more
frequently than favorable change (teachers:
χ² = 9.38, p < .01; peers: χ² = 4.79,
p < .05). Differences in the same direction
for attitudes toward school in general were
not significant (χ² = 3.60, p < .10). Attitudes
toward school work changed favorably
and unfavorably with equal frequency.
There was no tendency for good
male readers to show favorable attitude
change more frequently than unfavorable
attitude change on any of the attitude components.

The results for females were quite dif- 
ferent. Poor female readers showed favor- 
able and unfavorable change with about 
equal frequency on all four attitude components.
But good female readers were
much more likely to have favorable than
unfavorable change (teachers: χ² = 4.75,
p < .05; school in general: χ² = 15.69, p <
.001; school work: χ² = 3.92, p < .05).
Differences in the same direction for attitudes
toward peers were not significant (χ²
= 3.18, .05 < p < .10).


Child’s Report of Parental Behavior 


None of the differences between favor- 
able and unfavorable change for good male 
readers approached significance. Changes 
on 8 of the 15 dimensions were in a favorable
direction for good male readers. In
contrast, 12 of the 15 comparisons for poor
male readers were in an unfavorable direction,
three approaching or reaching significance
(hostile control: χ² = 7.05, p < .01; instilling
anxiety: χ² = 4.90, p < .05; control
through guilt: χ² = 3.60, p < .10).

The results for females again contrasted 
with those for males. Favorable changes 
occurred on 14 of 15 comparisons for good 
readers, 7 reaching or approaching significance
(acceptance: χ² = 10.23, p < .01; rejection:
χ² = 4.55, p < .05; inconsistent
discipline: χ² = 7.68, p < .01; nonenforcement:
χ² = 8.91, p < .01; hostile detachment:
χ² = 10.23, p < .01; withdrawal: χ²
= 6.55, p < .02; instilling anxiety: χ² =
2.91, p < .10). Only three of the dimensions
showed unfavorable change in the case of
poor female readers, one of which was significant
(control through guilt: χ² = 5.00,
p < .05).


Peer Relationships (Syracuse Scale of Social
Relationships)


No significant differences between favor- 
able (increased mean ratings) and unfavor- 
able (decreased mean ratings) change oc- 
curred for good male readers. For poor male 
readers, decreased mean ratings received 
from the same sex were more frequent 
than increased mean ratings (χ² = 10.37,
p < .01), while increases in mean ratings
made of the opposite sex were more frequent
than decreases (χ² = 5.67, p < .05).
Again, in contrast, no significant differences
were observed for poor female readers while
good female readers were much more likely
to receive higher than predicted mean ratings
from the same sex (χ² = 17.29, p <
.001).²


DISCUSSION


The results show a relationship between 
reading status at the beginning of the third 
grade and changes during that grade in 
certain social-emotional characteristics. The 
nature of the relationship is typically dif- 
ferent for males than for females. In gen- 
eral, negative consequences are incurred 
by poor reading males while few social- 
emotional benefits accrue for good reading 
males. Females, in contrast, obtain social-emotional
benefits from being good readers
but incur few negative consequences if they 
are poor readers. An exception appears in 
the case of the self-concept wherein, for 
females, there is no relationship between 
the initial reading level and subsequent 
self-concept change. Enhanced self-concepts 
follow for males who are initially good 
readers while the self-concepts of initially 
poor readers show some evidence of suffering
loss.

Had males and females shown similar
patterns of results, one might suspect that
the data reflect only high and low IQ correlated
behavior.³ The contrasting results


2 See Glick (1969b) for an analysis of made and
received peer ratings within the context of a theo- 
retical model of person-group relationships. 

3 IQs were available for only about 70% of the
subjects. Hence an analysis controlling on IQ could
not have been done without discarding a great
amount of data.


for males and females render an IQ explanation
highly implausible. Such an explanation
would require the positing of an
interaction between IQ and sex in relation 
to changes in the social-emotional variables 
such that the effects of IQ would generally
be the opposite for boys of what they would be for
girls.

The pattern of sex differences suggests 
that responses to early reading performance 
by educators, parents, and peers are different
for males than for females. Perhaps
these results reflect differing emphases in 
the techniques of socialization for males and 
females in our society: Control and shap- 
ing of male behavior may be primarily by 
means of negative sanctions for undesirable 
behavior while female behavior may be 
molded primarily by means of positive 
sanctions for desirable behavior. The spe- 
cific changes in the children’s perception 
of mother’s behavior are consistent with 
this interpretation. The fact that changes in 
mean ratings received from like-sexed peers 
showed the typical pattern of sex differences 
suggests that both male and female third 
graders, in accordance with the general 
norm, may already be responding differentially
to male and female success-failure
patterns. 

The self-concept results, however, do not 
appear consistent with the suggested sociali- 
zation paradigm, particularly since the ef- 
fects here were most pronounced for the 
general self-concept which, presumably, 
would be relatively more affected by social
feedback while the academic self-concept 
might be more affected by academic per- 
formance per se. 

The existence of a relationship between 
academic performance and subsequent 
changes in social-emotional characteristics 
suggests that educators are in fact directly
involved in the social-emotional as well as
the intellectual development of pupils regardless
of their philosophical position on
the question. The present findings regarding
sex differences are consistent with the common
sense notion that school is more a
girl's world than it is a boy's world. There
is further need for documentation of the
ways in which the school, family and
community social systems respond differ- 
ently to patterns of academic success and 
failure among males and females and what 
the consequences of those responses are for 
the total development of the child. It ap- 
pears that we are creating social environ- 
ments for boys in our society which offer 
punishment for inadequate school performance
but relatively little support or reinforcement
for good school performance.


4 In addition to the present Child's Report of
Parental Behavior data, a previous study showed 
that, in the case of academically failing boys, moth- 
ers attributed responsibility for failure to the child 
and his characteristics while, in the case of aca- 
demically failing girls, responsibility was attributed 
to characteristics of the school (Glick, 1970).


A CROSS-LAGGED PANEL ANALYSIS¹


WILLIAM D. CRANO* 
Michigan State University 

DAVID A. KENNY AND DONALD T. CAMPBELL
Northwestern University 


The literature of cognitive development has produced two opposing 
models of mental growth. One holds that the acquisition of concrete 
mental skills causes the later development of higher order organiza- 
tional schema or rules. The contrasting model postulates a progression 
in which the initial acquisition of larger schema results in the increased 
capacity to acquire new concrete skills. While both probably operate 
to some extent, an attempt was made in this research to determine 
the preponderant developmental sequence. The scores of 5,495 students 
who had taken intelligence and achievement tests in both fourth and 
sixth grades were analyzed through the use of the cross-lagged panel 
correlation technique. For students of suburban schools (N = 3,994), 
the abstract-to-concrete causal sequence predominated, while among 
inner-city school children, the opposite held. The specific causal re- 
lationships between skills assessed on the various subscales of the tests 
employed, the value of the cross-lagged panel correlation technique 
in causal analysis, and an extensive methodological examination and 
qualification of this analytic model are presented. 


The original impetus for the development 
of intelligence tests was provided by the call 
for a diagnostic tool with which to discriminate
between normal and retarded children.
Research was therefore focused upon the 
problems of measurement, not theory build- 
ing. In one of their early statements, for ex- 
ample, Binet and Simon (1905) specifically 
avoided any speculation concerning the pos- 
sible relationships between social or physio- 
logical variables and intelligence. Their task, 
as they envisioned it, was one of measurement,
not speculation.

Even in Binet's time, however, scientists
were not content to address themselves solely 


1 This research was supported by National Sci- 
ence Foundation Grant GS-1309X. We wish to ex- 
press our gratitude to Joel Aronoff, Hiram Fitz- 
gerald, and Nancy Hammond for their assistance at 
various phases of this investigation.

D. Crano, Department of Psychology, Olds Hall,
Michigan State University, East Lansing, Michigan 
48823. 


to the still-to-be-resolved problems of in- 
telligence measurement. Considered of 
greater importance were questions about the 
relationship between intelligence and 
achievement, and whether one of these fac- 
tors was in some way responsible for the 
generation or development of the other. To 
many of the early psychologists, two possi- 
bilities were immediately apparent: first, that 
intellectual advancement was a function of 
an organism’s progression from the acquisition
of concrete specific skills to the generation
of higher order abstract rules (which we
shall define as intelligence), contrasted with
the view that the ability for abstract thought 
was a constant quality whose development 
was facilitated through the organism’s inter- 
action with the environment. If we might 
use the terms intelligence and achievement 
loosely, the problem could be restated in the 
following way: Does the acquisition of spe- 
cific skills or the learning of specific informa- 
tion (achievement) result in an increased 
ability for abstraction (intelligence), or is the
progression more accurately described as one
in which intelligence causes achievement, 
that is, does the greater ability to form ab- 
stractions result in a greater amount of 
concrete information being absorbed and re- 
tained? 

It would be a mistake to view these possi- 
bilities as being mutually exclusive. Quite 
possibly, the causal sequence might operate 
in both directions, with the acquisition of 
concrete specific skills causing the develop- 
ment of higher order abstract rules which in 
turn give rise to yet additional concrete acquisitions.
While this reciprocal dependence
might well operate, one sequence may pre- 
dominate. The primary focus of this report 
is the investigation of the preponderant 
causal sequence. 

The question of preponderance is not at 
all novel. As Thorndike (1903) observed, 


A human being is... the sum of an original nature 
acted on by antenatal influences and the later en- 
vironment. The first problem of educational sci- 
ence concerns the relative shares of these agencies 
in determining human thought and conduct [p. 40]. 


Although scientists have confronted the 
question of preponderance or “relative 
shares” since Thorndike’s time, they have 
lacked the necessary analytic tools to resolve 
it. Despite this fact, the social and political 
importance of this question has forced many 
scientists into premature speculation concerning
its probable solution. Thus Galton
(1892), responding within the historic and 
social confines of 19th century England, 
would state long before the development of 
even marginally reliable intelligence tests, 


I have no patience with the hypothesis occasionally
expressed, and often implied, especially in tales
written to teach children to be good, that babies
are born pretty much alike, and that the sole agencies
in creating differences between boy and boy,
and man and man, are steady application and 
moral effort. It is in the most unqualified manner 
that I object to pretensions of natural equality. ... 
I acknowledge freely the great power of educational 
and social influences in developing active powers of 
the mind, just as I acknowledge the effect of use 
in developing the muscles of a blacksmith's arm, 
and no further [p. 12]. 


Galton saw clearly the importance of environmental
influences in the development
of potential; inherited mental “powers,” 
however, were seen to be the preponderant 




causal component in the intelligence-achieve- 
ment sequence. This view, influenced un- 
doubtedly by Galton's cousin, Charles Dar- 
win, has been forcefully defended in today's 
psychology by Cyril Burt (1944, 1949), 
among others. 

As partial proof of the importance of 
genetic inheritance in the determination of 
intelligence differences, scientists of this per- 
suasion point to the impressive volume of 
studies demonstrating that animals of greater 
problem-solving acuity, greater speed,
greater longevity, etc., can be obtained 
through a carefully controlled breeding 
process. 

Even more pertinent are the results of 
numerous studies of twins reared separately. 
If the concrete-to-abstract causal sequence 
predominated, then the intelligence or 
achievement test scores of twins assigned at 
random to different learning environments 
would not be expected to correlate beyond 
chance levels. Erlenmeyer-Kimling and 
Jarvik’s (1963) review of the last 50 years of 
twin studies, however, demonstrates that 
this expectation is clearly in opposition to 
the obtained results. The relationship be- 
tween the intelligence test scores of twins 
reared apart has been consistently greater 
than that of test scores of siblings reared 
apart, and clearly stronger than that obtained
between scores of nonrelated persons.³

Other scientists, of course, find it more 
worthwhile to emphasize the importance of 
environmental influences over genetic fac- 
tors. Piaget (1950, 1952), for example, recog- 
nized the fundamental importance of inborn 
processes (“elementary sensorimotor mecha- 
nisms,” or “reflexes”), but relegated to them 
a relatively minor role in the determination 
of intelligence and achievement in the nor- 
mal child. These reflexes (sucking, grasping, 
orientation to light, arm waving to strong 
stimulation, etc.), ubiquitous in all but the 
most severely physically retarded child, 


3 The interpretation of these findings must be
made in light of the fact that the twins studied
in the investigations reviewed by Erlenmeyer-
Kimling and Jarvik (1963) were not typically assigned
at random to different learning environments,
and thus, similar environmental influences
might be at least partially responsible for similarities
of test scores.


could hardly be made to explain the wide
range of individual differences in evidence 
even among “normal” persons. Piaget con- 
ceptualized intelligence, or the ability to deal 
in abstractions, as a dynamic developmental 
phenomenon, rather than a fixed genetically 
inherited quality. Further gains in intelli- 
gence could be effected by the acquisition of 
specific skills, information, and rules which, 
with other concrete skills, information, and 
rules, combined in the formation of higher 
order, abstract, generalized principles (i.e.,
intelligence). Piaget conceptualized a causal 
sequence in which the acquisition of specific 
skills (achievement) combined with other 
specific skills in generating more abstract 
cognitive rules (intelligence). 

Between the extremes of the genetic and 
environmentalist positions all shades of 
opinion are represented. The continuing 
controversy within this area serves to in- 
dicate that the fundamental question of 
preponderance of influence in the deter- 
mination of intelligence and achievement 
differences has yet to be resolved satisfac- 
torily. A consideration of current educational 
practices, however, would seem to belie 
this proposition. Each year throughout the 
United States, for example, hundreds of 
thousands of dollars are spent on the stand- 
ardized tests used in elementary schools. In 
these testing programs, there is a heavy in- 
vestment not only in intelligence tests, but 
also in achievement tests, which purportedly 
mark the progress of the student and the 
accomplishments of the educational process. 
The use of intelligence tests is based on the 
assumption that such instruments tap a dimension
distinct from the one measured in
the achievement test—that intelligence is a 
prerequisite for achievement. Intelligence 
tests are expected to measure, better than
past achievement, a student's potentialities 
for future achievement. If statements of a 
causal nature were to be made, intelligence 


4 Other positions consistent with the concrete-to-abstract
causal sequence were developed by Ferguson
(1954, 1956) and Hunt (1961), among others.
Support for even the most radical environmentalist
hypothesis could be drawn from the work of Scott
(1964), who demonstrated the dramatic effects of
early sensory deprivation upon the later development
of a wide range of organisms.


would be seen as one of the causes (although
possibly only one of many) of subsequent 
achievement. Clearly the reverse is not 
usually held among educational test special- 
ists. Rarely would one find advocated the 
thesis that present achievement is one of 
the causes of later intelligence scores. But 
if the usual assumptions are wrong—if so- 
called intelligence tests are just another 
(more generalized) form of achievement 
test—then important revisions in research 
policy would necessarily follow. Surely it is 
conceivable, given the continuing contro- 
versy surrounding the preponderance of 
causation issue, that the usual assumptions 
could indeed be wrong, but how is one to 
investigate the validity of these assump- 
tions? 

As discussed above, the probable cause of 
controversy regarding the preponderance of 
causation was not the ill will or small- 
mindedness of our scientific predecessors, 
but rather a lack of proper methodological 
tools with which to confront this issue. The 
principal drawback is that the question of 
preponderance is basically a correlational 
one. Assuming that achievement and in- 
telligence could be independently measured, 
the ideal study would examine the relationship
between these two factors and the
changes in their relationship over time. The
word relationship here should be emphasized, 
as it clearly points up the correlational 
nature of this “ideal” investigation. Cer- 
tainly, other, more powerful statistical tech- 
niques have been employed in research of 
this type, but often with less than adequate 
justification (e.g., Hunt, 1961, has discussed
the misuse of analysis of variance techniques
in this field). For many years, however, it
has been the rule that “correlation does not
imply causation.” This old saw, bothersome 
though it has been, was nevertheless valid. 

With the recent development of the cross- 
lagged panel correlational technique, how- 
ever, inferring causal relationships on the 
basis of correlational results has become 
possible. (For a description of this technique, 
more extensive and technical than that to 
be presented in this report, see Campbell, 
1963; Campbell & Stanley, 1963; Pelz & 
Andrews, 1964; Rozelle & Campbell, 1969.) 
This method is based upon one of science's
most useful rules of causal inference, that of
time precedence: In every science, when a 
given event consistently precedes the occur- 
rence of another, and the reverse does not 
hold, one of only two possibilities is enter- 
tained: (a) Event 1 is presumed to be a 
cause (possibly only one of many) of Event
2; or (b) both Event 1 and Event 2 are the 
effects of some more general cause(s). 

It is the aim of all experimental design to 
negate the possibility of the second alterna- 
tive. By controlling the application of the 
independent variable, the experimenter is 
assured that its occurrence was not depend- 
ent upon some more general prior event, 
and, thus, any differences occurring between 
experimental and control subjects can be 
attributed to the presence or absence of the 
independent variable. In this way, the 
second alternative (i.e., that Events 1 and 
2 are both effects of some more general 
cause) is rendered implausible (see Crano & 
Brewer, in press). 

But how does the concept of time prece- 
dence impinge in correlational investiga- 
tions? Clearly, correlational techniques can 
be employed to study the strength of a rela- 
tionship between variables, but no reliable 
causal estimate can be made from a single 
coefficient of correlation taken independ- 
ently. Suppose, however, that one had 
available correlational information relating 
two variables at more than one point in time. 
For the sake of later exposition, let us 
assume that the two variables of interest 
were individuals’ scores on achievement and 
intelligence tests, administered (approxi- 
mately) simultaneously, at least twice (say, 
2 years apart, in Grades 4 and 6), and that 
every possible relationship between these
scores had been calculated. The resulting
matrix of correlations could be presented in
the manner employed in Figure 1.

On the basis of much prior experimentation,
we would expect the unlagged, synchronous
correlations (i.e., $r_{I_4A_4}$, $r_{I_6A_6}$) and
the lagged autocorrelations (i.e., test-retest
correlations $r_{I_4I_6}$, $r_{A_4A_6}$) to be quite
large, if the tests employed were reliable.
From the perspective of causal inference,
however, the correlations crossed and lagged
over time (i.e., $r_{I_4A_6}$, $r_{A_4I_6}$) provide
information of critical importance.

Let us consider the three possibilities arising
from a comparison of $r_{I_4A_6}$ with $r_{A_4I_6}$. If
high intelligence test scores in Grade 4 are
consistently followed by high achievement
test scores in Grade 6, but the converse is
not true (i.e., high $A_4$ scores are not
consistently followed by high $I_6$ scores), then
we would expect $r_{I_4A_6}$ to be greater than
$r_{A_4I_6}$. If, on the other hand, achievement
was the precursor of intelligence, then the
pattern of correlational differences would be
reversed (i.e., $r_{A_4I_6} > r_{I_4A_6}$). As was stated
above, the presence or change in a variable
(e.g., an intelligence test score) consistently
followed by a change in status (either a gain
or loss) of another variable (e.g., an achievement
test score) satisfies the time-precedence
notion of causality. Thus, if $r_{I_4A_6} > r_{A_4I_6}$
(and if all other factors were constant) we
could assume that the causal vectors were
in the direction of intelligence causing later
achievement. This would of course not rule
out some type of reciprocal causation operating
as a feedback loop, with, for example,
gains in intelligence causing later gains in
achievement scores which in turn trigger
later gains in intelligence, etc., but would
rather demonstrate that the preponderance
of causation was in the direction of intelligence
causing later achievement. Such a
finding would be an exciting confirmation
of long-held but untested beliefs of the causal
efficacy of intelligence in partially determining
achievement.
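
The comparison just described can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the authors' analysis: the function names, the dictionary keys, and the `tolerance` parameter are hypothetical, and the direction label is a purely descriptive reading of which cross-lagged coefficient is larger, with no significance test and no adjustment for differences in reliability.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def panel_correlations(i4, a4, i6, a6):
    """The six correlations of a two-wave, two-variable panel
    (I = intelligence, A = achievement; 4 and 6 = grade tested)."""
    return {
        "synchronous": (pearson_r(i4, a4), pearson_r(i6, a6)),
        "autocorrelation": (pearson_r(i4, i6), pearson_r(a4, a6)),
        "cross_lagged": (pearson_r(i4, a6), pearson_r(a4, i6)),
    }


def preponderant_direction(r_i4a6, r_a4i6, tolerance=0.0):
    """Name the direction favored by the larger cross-lagged correlation.

    `tolerance` is a hypothetical band within which the two coefficients
    are treated as equal; the text specifies no such threshold.
    """
    difference = r_i4a6 - r_a4i6
    if abs(difference) <= tolerance:
        return "no preponderance"
    if difference > 0:
        return "intelligence -> achievement"
    return "achievement -> intelligence"
```

Passing each pupil's four scores as parallel lists to `panel_correlations` and then handing the `"cross_lagged"` pair to `preponderant_direction` returns one of the three outcomes distinguished in the text.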

It is possible, of course, that no causal
relationship exists between intelligence and
achievement, or that both of these qualities
are the effect of some more general causal
influence. In either case, no differences between
the cross-lagged values would be
expected (i.e., $r_{I_4A_6} = r_{A_4I_6}$). Such a result
would provide little justification for the
separate status of intelligence tests, in that
it would not support the assumption that
intelligence is a predictor of later achievement
in a way in which achievement itself
is not.


Fig. 1. Output presentation mode: schematic
representation.

A final outcome, that the cross-lagged 
difference was opposite that usually pre- 
dicted, that is, achievement better predicts 
later intelligence, and thus the classical no- 
tions of causality would be more correct if 
reversed, is also a possibility. This result 
(i.e., $r_{A_4I_6} > r_{I_4A_6}$) would also be exciting,
and one perhaps anticipated by psychologi- 
cal learning theory and recent formulations 
of intelligence as presented by Bruner (1966), 
among others. 

As discussed above, intelligence may be 
epitomized as an adaptive flexibility in re-
sponding to novel problems presented by
the environment (and sampled in intelli- 
gence tests), while achievement is more 
directly related to the mastery of adaptive 
skills in dealing with familiar tasks (such as 
school subjects). Many studies have shown
that habits and skills learned in specific 
settings generalize to more novel stimuli and 
settings. When any novel task is presented,
the repertory of available skills, hunches, 
and insights is greater if the pool of specific 
past achievements is large and diverse. If 
we conceive of such learning processes as 
continuing throughout the school years, then 
it follows that this year's specific learning 
achievements will generalize into next 
year's increased ability to solve novel prob-
lems, that is, into next year's intelligence.
Intelligence would thus be viewed as a very 
general distillate of past achievements. 

Regardless of one's theoretical stance, the 
cross-lagged panel correlation technique pro- 
vides a realistic choice among the three 
alternative causal possibilities noted above. 
Actions taken on the basis of the results ob- 
tained in this investigation will vary, prob- 
ably as a function of prior theoretical com- 
mitments, but unlike before, these actions 
will be at least partially grounded in or con-
strained by empirical evidence.

Before describing the tests and subject 
sample employed in this investigation, a 
word of caution regarding the generalization 
of this analytic technique to other questions 
of a causal nature is in order, since the appar-
ent simplicity of this method can be de-
ceiving. As Rozelle and Campbell (1969)
noted, the cross-lagged panel correlation 
does not always enable an unambiguous de- 
cision between two competing causal hy- 
potheses, because, in fact, four competing 
hypotheses exist in situations of this type. 
Suppose, returning to the previous example, 
that $r_{I_4A_6} > r_{A_4I_6}$. Would this result neces-
sarily imply that the preponderance of
causal effects was in the direction of intelli- 
gence causing future achievement? It would 
not, unless some needed qualifications were 
first postulated. 

Of the four simple cross-causal relations 
that are possible between intelligence and 
achievement, we have assumed that two are 
so implausible that they can be disregarded. 
Specifically, we reject the two possible nega-
tive relationships: high achievement causes
later intelligence losses (low achievement
causes later intelligence), and high intelli-
gence causes a subsequent decline in achieve-
ment (low intelligence causes high subse-
quent achievement).

In the present investigation, the possibil-
ities involving a negative relationship be-
tween intelligence and achievement are
extremely implausible, and for the moment
will be deleted from the list of probable
competing hypotheses. There is much em-
pirical evidence supportive of this action.
The results of numerous investigations, for 
example, have demonstrated that rarely, if 
ever, will achievement and intelligence 
scores be negatively correlated. Thus, we 
will oppose two of the four potential hy-
potheses, without any undue concern re-
garding the plausibility of the remaining
two. (The viability of this assumption will
be examined in a later section of this paper.)

In many other investigative situations 
amenable to cross-lagged panel analysis, the 
degree of existing knowledge regarding the 
general relationship between the two vari- 
ables of interest is so restricted that none of 
the four competing hypotheses can arbi-
trarily be discarded. In Rozelle and Camp-
bell’s (1969) study of the causal relationship
of grades and class attendance, for example, 
three of the four possible competing hy- 
potheses were viewed as plausible, and two 
were “confirmed” in the judgment of the
investigators. In situations of this type (and
the present study is not one of them) the
cross-lagged panel technique must be em-
ployed with extreme caution (see Kenny,
1970; Rickard, in press; Sandell, 1971). 

Assuming that the general relationship 
between intelligence and achievement is 
both positive and substantial, we may then 
proceed to a discussion of the tests and 
samples employed in the present investiga- 
tion. 

METHOD 


Sample 


The data on which the analyses were based were 
provided by the Board of Education of the Mil- 
waukee Public Schools. In Grades 4, 6, and 8, both 
intelligence and achievement tests are adminis- 
tered to all public school children. Within any 
given test year, the two tests are administered with 
a minimum of time lag between them. In the 
present investigation, relations between intelligence 
and achievement test scores of children attending 
fourth grade in the academic year 1963-64, and 
sixth grade 2 years later, were investigated. A total 
of 5,495 complete sets of data were collected. That 
is, 5,495 children who had (in 1963-64) completed
both intelligence and achievement tests in their 
fourth year of elementary school and also, 2 years 
later, completed (parallel forms of) these tests in 
the sixth grade, comprised the subject sample. 


Tests 


Level three of the Lorge-Thorndike intelligence 
test (1957 version) was administered to the sample. 
All children in Grade 4 received the same form 
of the intelligence test. In the sixth grade, an alter-
nate form of the Level 3 test was employed. In the
construction of this instrument, the authors at-
tempted to generate tests aimed at the assessment
of behavioral characteristics “which they would de-
scribe as intelligent [Lorge & Thorndike, 1957, p.
12].” The tasks purportedly dealt with the ability
to employ abstractions and general concepts, en-
tailed the interpretation, use, and recognition of
the relationships among symbols, required flexibil-
ity and the ability to employ novel patterns of
concepts, and, finally, focused upon power rather
than speed (see Lorge & Thorndike, 1957, pp. 12-


This instrument consists of both verbal and non-
verbal batteries. Within the verbal battery were
tasks that involved completion, verbal classifica-
tion, arithmetic reasoning, and vocabulary. This
test consisted of 90 items for which 34 minutes were
allotted. The nonverbal battery (79 items, 27 min-
utes’ administration time) was entirely pictorial or
numeric, and consisted of tests involving pictorial
classification and analogy, as well as numeric rela-
tionships.

The Iowa Tests of Basic Skills constituted the
achievement test battery administered to the sam-
ple. Alternate forms of this test were employed in
the fourth and sixth grade test administrations. 
These tests “provide for the measurement...of 
certain skills involved in reading, work-study, lan- 
guage, and arithmetic [Manual, Iowa Tests of Basic 
Skills, 1956].” This device consists of the following
subscales:

Vocabulary. The 38 items (46 for the sixth grade 
test) of this subtest consist of a stimulus word in 
context which the respondent is to match with one 
of four definitions provided. A total of 17 minutes 
is allotted for this test.

Reading Comprehension. In this test, respond- 
ents are provided a selection to read, varying in 
length from a few sentences to an entire page. The 
function of this test is to determine the ability of 
the student to apprehend the meaning of the com- 
munication, to draw appropriate inferences, to 
grasp the significance of the information provided, 
etc. The fourth grade test consists of 68 items, the
sixth grade, 76. Administration time for both is 55 
minutes. 

Language. This test consists of four separate 
subscales, concerned with spelling, punctuation, 
capitalization, and usage. The format of all items 
employed in these scales is similar. Respondents 
are presented with a series of stimuli, one of which 
might be in error. The task of the subject is to 
identify this error. In the Spelling subscale (38-46 
items, 12 minutes’ administration time), for ex- 
ample, four words are presented, and one of these 
may be misspelled. In both the Capitalization and
Punctuation subscales, one or two sentences ex- 
tending over three lines of equal length are pre- 
sented. The respondent is to identify the line on 
which a capitalization or punctuation error occurs. 
Both the capitalization and the punctuation test 
consist of 39 (42) items; the former is adminis- 
tered in 15 minutes, while the latter is allocated 
20. Language Usage items consist of 3 sentences, 
one of which could contain a usage error. Tested
on this subscale was the use of the pronouns, verbs, 
adjectives, and adverbs. In addition, the avoidance 
of double negatives and redundancies, commonly 
misused homonyms and miscellaneous word forms 
was investigated. In both grades sampled, this test 
consists of 32 items, with a 20-minute time allow-
ance.

Work-Study Skills. This test is composed of 
three subscales. The skills assessed in these tests 
“are those which have been traditionally classified
as ‘work-study’ skills and which are of crucial im- 
portance to self-education in out-of-school and 
postschool activities [Manual, 1956, p. 64].” The 
first of these instruments is concerned with Map 
Reading. Within this test, a number of different 
types of maps are presented to the student, and 
questions concerning distances, directions, locations 
and map legends are provided. The test consists of
27 (40) multiple-choice questions, with a time
limit of 30 minutes. The second component of the
work-study skills test is concerned with Graph and
Table Reading. In this section of the test, at least
five different types of illustrative figures are em-
ployed (e.g., pictographs, line graphs, circle graphs,
various tabular materials, etc.). Respondents must
interpret the illustrations and sometimes perform
arithmetic operations in generating the appropriate
response. This test is composed of 24 (28) items,
and is administered in 20 minutes. The final com-
ponent of the work-study test investigates the stu-
dent's Knowledge and Use of Reference Materials.
Test items deal with the proper use of “the parts
of a book, the globe, current magazines, the dic-
tionary, the encyclopedia, and an atlas [Manual,
1956, p. 67].” A total of 52 (59) items are employed
in this test, with 30 minutes’ administration time
allotted.

5 The first value refers to the number of items in
the fourth-grade test; the second, to the number of
items in the sixth-grade test.

Arithmetic. The final section of the Iowa Tests 
was concerned with the assessment of arithmetic
skills. This test is composed of two subscales. The
first deals with the student’s grasp of Arithmetic
Concepts. The logic of arithmetic computation is
examined in this subtest. Mastery of concepts in- 
volving the number system, whole numbers, deci- 
mals, fractions, ratios and percentages, standard 
measures, and geometric figures is examined in this 
test of 36 (45) items, for which 30 minutes is al- 
lotted. In the Arithmetic Problem Solving subscale 
of the Arithmetic Skills test, actual computational 
expertise is assessed. All the items in this test are 
of the word-problem variety. None involve mere 
calculation, but demand that the student read the 
item and respond to the relevant aspects under in- 
vestigation. This test is composed of 27 (31) items, 
and can be administered in 30 minutes. 

In total, the Iowa Tests of Basic Skills consists of
425 (487) items administered in 4 hours and 39 
minutes of working time. The Lorge-Thorndike 
intelligence test is composed of a total of 164 items, 
and can be administered in 61 minutes. 
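The sixth-grade item total and the overall working time quoted above can be checked with a short tally. This is only a bookkeeping sketch: the dictionary name and its layout are assumptions, and the numbers are the sixth-grade item counts and administration times given in the subtest descriptions.

```python
# Sixth-grade item counts and administration times (minutes) as quoted in the text.
IOWA_SUBTESTS = {
    "Vocabulary": (46, 17),
    "Reading Comprehension": (76, 55),
    "Spelling": (46, 12),
    "Capitalization": (42, 15),
    "Punctuation": (42, 20),
    "Usage": (32, 20),
    "Map Reading": (40, 30),
    "Graph and Table Reading": (28, 20),
    "Reference Materials": (59, 30),
    "Arithmetic Concepts": (45, 30),
    "Arithmetic Problem Solving": (31, 30),
}

total_items = sum(items for items, _ in IOWA_SUBTESTS.values())
total_minutes = sum(minutes for _, minutes in IOWA_SUBTESTS.values())
hours, minutes = divmod(total_minutes, 60)
print(f"{total_items} items, {hours} h {minutes} min")
```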


TABLE 1

MEANS, STANDARD DEVIATIONS, AND MATRIX OF INTERCORRELATIONS
FOR ALL SUBTEST AND COMPOSITE SCORES, TOTAL SAMPLE


1, Vocabulary (4)*| 1.0 
2, Read Comp (4) | 743 1.0 
8. Spelling (4) | 593 584 10 
/4. Capitalisa- (4) 
tion 602 028 635 1.0 
5. Punctuation (4) | 500 549 644 651 1.0 
6, Usage (4) | 697 678 573 611 557 
7. Map Reading(4) | 560 677 437 497 434 
8. Graphs (4) | 613 642 483 565 [D 
9. References (4) | 501 630 — 548 597 518 
10. Ar. Concept (4) | 614 624 514 576 525 
11. Ar. Problem (4) | 534 572 499 556 502 
12. Composite (4) 
Ach 822 858 689 740 659 
18. VerbalIQ (4) | 767 735 663 646 548 
14. Nonverbal (4) 
IQ 527 552 452 528 4n 
15. Composite | (4) 
IQ 703 702 606 641 556 
16. Vocabulary (6) | 752 706 503 538 469 
17. Read Comp (6) | 718 723 504 553 — 487 
18. Spelling (6) | 596 577 788 575 511 
19. Capitalisa- (6) 
tion 590 592 569 645 574 
20. Punctuation (6) | 561 563 550 576 562 
21. Usage (6) | 656 631 535 555 495 
22. Map Reading (6) 523 529 373 461 411 
23. Graphs (6) 497 505 867 450 381 
24. References (6) | 649 664 579 636 558 
26. Ar. Concept (6) | 670 561 454 522 469 
26. Ar. Problem (6) | 507 523 434 513 457 
27. Composite (6) 
Ach 787 738 585 636 570 
28. Verbal IQ (6) 708 680 582 593 509 
29. Nonverbal (6) 
1Q 525 — 594 — 438 — 520 — 458 
30. Composite (6) 
IQ 659 — 049 — 540 6505 6516 
M 4.17 3.90 433 4.18 4.11 
o -900 979 1.10  .898  .973 


1.0 
53 10 

55 606 10 

567 53 056 10 

5:0 8588 6051 5&0 10 

528 — 487 685 677 04 10 

700 0606 763 754 78 790 1.0 

608 — 509 — 650 640 664 585 807 10 

504 8510 52 49 52 42 006 673 19 

654 603 6003 621 676 589 782 901 92 

675 — 540 — 501 — 542 572 486 740 718 B 

450 — 5661 624 559 602 627 745 716 556 

681 4» — 472 637 497 476 09 072 45 

603 474 597 637 559 501 677 639 5M 

690. 448 — 501 — 509 — 594 496 — 044 62W — 499 

707 — 458 535 513 525 485 08 675 502 

49 511 58 — 45 627 447 580 559 BOL 

449 — 490 — 590 445 — 6501 432 563 529 469 

822 54 69 9 62 5M 74 T2 5 

519 508 — 575 — 498 — 604 535 $050 6m 50 
473 430 — 529 — 488 — 550 534 600 568 #8 
700 587 658 016 658 556 79 70 603 
60 560 6n 572 62 B0 74 815 634 
504 498 — 548 48] 52 49 61 09 728 
627 86 018 563 63 558 T0 784 79 
3.06 4.05 3.81 4.20 4.04 4.08 406 99.2 98. 
1.15 .807  .884  .8/4 .800 .69 774 14.5 15.7 


* Decimal points for all correlations less than 1.0 have been omitted. The parenthesized figure following the title of each test refers

RESULTS


The matrix of correlations among the 
various subscales employed in this investi- 
gation with means and standard deviations 
for each subscale, over both measurement 
sessions, is presented in Table 1. Normed 
grade equivalents, based upon the number 
of correct items answered per scale, are the 
basic unit of data of the Iowa achievement 
tests. In the Lorge-Thorndike test, raw 
item scores were adjusted by the respond- 
ent’s age in forming the IQ scores used in 
this analysis. In addition to these values, 
composite scores for both the intelligence 
and the achievement tests consisting of a 
simple average of their respective subscale
scores were derived, and are also presented 
in Table 1. 

With this information, an estimate of the 
viability of one of the assumptions necessi- 
tated by the use of the cross-lagged panel 
technique can be made. As was noted earlier, 
this approach, in and of itself, does not en- 
able the investigator to choose between one 
of two hypotheses, but rather between pairs 
of logical possibilities. In the present investi- 
gation, however, one hypothesis of each of 
these competing pairs (i.e., the negative
relationships) was discarded as extremely
implausible. Information consistent with
this assumption is presented in Table 1. 
The direction of correlations between fourth 
and sixth grade tests, for example, is uni- 


TABLE 1 (continued): INTERCORRELATIONS FOR ALL SUBTEST AND COMPOSITE SCORES, TOTAL SAMPLE


s|e[visi»|[a|a[a] 


1.0 
94 10 

63 so 10 

5? — 6&1 58 10 

99 — 014 ex 060 10 

610 580 604 605 72 1.0 

640 695 681 602 685 61 10 

679 S678 600 434 614 4r] 485 10 
B6 543 — 595 418 492 452. 47 — 580 
TI! — 679 735 650 706 68) 678 635 
S41 610 — 64 621 596 — 682 575 — 610 
WI 537 859 498 50 59 5277 555 
747 — 858 s 699 757 731 T3 7€ 
70 738 750 6s 647 601 65 672 
T55 551. 589 — 464 55 562 542, 6529 
98 601 716 597 64 638 603 588 
959 591 5.08 6.21 6.57 6.26 6.00 5.98 
1.8 138 138 1.29 1.62 1.82 1.87 12 


a |u| a | x 


1.0 
620 10 
685 — 709 1.0 
551 674 673 1.0 
833 175 728 1.0 
D 732 642 677 7900 10 
501 — 648 599 517 650 — 781 1.0 
923 1.0 
559 — 738 663 584 770 932 
6.12 6.86 — 5.97 5.95 6.06 99,6 99.8 99.7 
116 1.14 -809 .920 1.00 15.4 145 14.0 


to the grade in which the test was administered. Complete titles and descriptions for each subscale are provided in the text.

Fig. 2. Cross-lagged panel results of the inter-
relations of intelligence and achievement test com- 
posite scores. 


formly positive. In fact, not a single negative 
relationship appeared in the entire matrix 
of correlations. 

Additional confirmatory information can 
be obtained by considering the composite 
score relationship of the synchronous un- 
lagged IQ and achievement tests. The corre- 
lation between these contiguously adminis- 
tered tests is positive and significant at both 
measurement periods
($r_{I_4A_4} = .7815$, $r_{I_6A_6} = .7700$; $p < .001$,
$df = 5493$ for both correlations).
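As a rough check on why correlations of this size are so far beyond the .001 level with 5,493 degrees of freedom, the usual t statistic for a single correlation can be computed directly. The sketch below assumes the standard formula t = r * sqrt(df / (1 - r^2)); the function name is illustrative, and the paper does not state which test was actually applied.

```python
import math


def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df


N = 5495  # complete sets of data reported in the Sample section
for label, r in (("Grade 4", 0.7815), ("Grade 6", 0.7700)):
    t, df = t_for_correlation(r, N)
    print(f"{label}: r = {r:.4f}, t({df}) = {t:.1f}")
```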


Both of these findings serve to render im- 
plausible the rival hypothesis that a negative 
relationship exists between intelligence and 
achievement as measured on the tests em- 
ployed in this investigation. We are thus in 
a position to investigate the remaining 
possibilities, namely, that the causal rela- 
tionship is predominately in the direction of 
intelligence affecting later achievement, or, 
of achievement influencing later intelligence. 

A number of analytic options is available 
in this study, but none is completely de- 
sirable. One of the most obvious of these 
consists of a comparison of the crossed 
and lagged composite score correlations 
($r_{I_4A_6}$, $r_{A_4I_6}$). Again, we must emphasize
the probable reciprocal causal dependence 
between these two dimensions. It seems 
highly probable that both of the possible 
causal relationships operate to some extent, 
in a type of feedback system. The test be-
tween the cross-lagged coefficients simply
enables some estimate concerning the pre-
ponderant cause-effect relationship to be
made. The pattern of relationships necessary
for this comparison is presented in Figure 2.

* The relationship between the intelligence and
achievement tests employed in the present investi-
gation appears to be consistent over time. A test
of significance between these two correlations dis-
closed that the null hypothesis that $r_{I_4A_4} = r_{I_6A_6}$
could not be rejected ($z = 1.52$, $p > .05$).


The cross-lagged correlations are both posi- 
tive and substantial, and suggest a feedback 
system in which both operations affect one 
another to a great extent. A comparison of 
the cross-lagged correlations indicates, how- 
ever, that the predominant causal sequence 
is that of intelligence causing later achieve- 
ment. A test of this inequality revealed that 
the obtained difference between $r_{I_4A_6}$ and
$r_{A_4I_6}$ was statistically significant ($t = 2.941$,
$df = 5492$, $p < .01$, two-tailed). For the
total group of respondents, then, the pre- 
ponderant causal sequence is apparently in 
the direction of intelligence directly predict- 
ing later achievement to an extent signifi- 
cantly exceeding that to which achievement 
causes later intelligence. 
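Because both cross-lagged correlations come from the same children, a test of their difference has to allow for their dependence. The sketch below implements one commonly cited large-sample form of a Pearson-Filon z test for two correlations that share no variable. It is offered as an assumption about the general approach, not as the exact computation used in this study. The synchronous correlations are those reported in the text; the cross-lags and autocorrelations passed in are hypothetical placeholders.

```python
import math


def pearson_filon_z(n, r12, r34, r13, r14, r23, r24):
    """z for H0: rho12 = rho34 when both correlations come from the same n cases.

    Variables 1..4 are distinct measures; r12 and r34 are the two
    correlations being compared, the rest are the remaining four.
    """
    k = ((r13 - r23 * r12) * (r24 - r23 * r34)
         + (r14 - r13 * r34) * (r23 - r13 * r12)
         + (r13 - r14 * r34) * (r24 - r14 * r12)
         + (r14 - r12 * r24) * (r23 - r24 * r34))
    variance = (1 - r12 ** 2) ** 2 + (1 - r34 ** 2) ** 2 - k
    return math.sqrt(n) * (r12 - r34) / math.sqrt(variance)


# Labels: 1 = I4, 2 = A6, 3 = A4, 4 = I6.
z = pearson_filon_z(
    n=5495,
    r12=0.73,    # I4 with A6 (hypothetical cross-lag)
    r34=0.70,    # A4 with I6 (hypothetical cross-lag)
    r13=0.7815,  # I4 with A4 (synchronous, from the text)
    r14=0.82,    # I4 with I6 (hypothetical autocorrelation)
    r23=0.78,    # A6 with A4 (hypothetical autocorrelation)
    r24=0.7700,  # A6 with I6 (synchronous, from the text)
)
print(f"z = {z:.3f}")
```

When the four remaining correlations are all zero, the correction term `k` vanishes and the statistic reduces to the familiar large-sample test for two independent correlations.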

The same causal sequence may not, of 
course, operate in all groups. Of extreme 
importance today, for example, is the ques- 
tion of whether the pattern of causal rela- 
tionships obtained from data on students 
in inner-city schools would be similar to 
that obtained from a suburban sample. To 
consider this question, schools were divided 
into core and suburban samples. A core 
school was one that was eligible for compre-
hensive programs of aid under Title I of
the Elementary and Secondary Education 
Act, for the 1967-68 school year. Upon 
recalculating the matrix of correlations for 
both core and suburban samples, it was 
found that among suburban students, the 
intelligence-causes-achievement sequence 
based on a consideration of composite 
scores clearly predominated ($r_{I_4A_6} = .7829$;
$r_{A_4I_6} = .7049$, $t = 3.479$, $df = 3991$,
$p < .001$, two-tailed). Within the core
sample, the direction of differences between
the cross-lagged correlations was opposite to
that of the suburban group ($r_{I_4A_6} = .6086$,
$r_{A_4I_6} = .6180$, $t = -.521$, $df = 1498$,
$p > .05$). Although this finding was not
statistically significant, the directional differ-
ences between the core and suburban sam-
ples might be used to stimulate a good deal
of theoretical speculation regarding the
nature of the predominant causal sequence
in relatively advantaged and relatively de-
prived groups.

* This test was based upon a correction of the
usual t test between correlations, suggested by
Pearson and Filon (1898), which takes into account
the indirect correlation between the arrays under
comparison, which are modified by the four other
relevant values (see also Peters and VanVoorhis,
1940, p. 185).

Before undertaking an action of this type, 
however, one should be aware of the limita-
tions which the use of composite scores
imposes. The Iowa tests composite, for 
example, consists of the average score of 11 
widely varying subscales. It seems unlikely 
that such a heterogeneous combination 
could prove meaningful. Two students shar- 
ing the same composite score, for example, 
might well have completely different pat- 
terns of correct and incorrect responses. 
Similarly, in the Lorge-Thorndike intelli- 
gence test composite, verbal and nonverbal 
skills are combined to give the overall aver- 
age. The meaningfulness of such an average 
is certainly open to question. Thus, any
speculation based upon the composite score
data presented above must be tempered by 
extreme caution. 

One solution to the composite-score prob- 
lem is an investigation of the relationships 
among individual subscales of the tests 
employed. In both the Iowa tests and the 
Lorge-Thorndike, the internal reliability 
coefficients of individual subscales are quite 
large. We might assume, therefore, that all 
items within any given subscale focus upon 
the same ability. In addition to providing a 
solution to the problems generated by the 
use of composite scores, such an approach 
enables a more precise investigation of the 
various relationships that exist among the 
various skills or abilities tapped by the com-
ponents of the two tests which were em-
ployed.

To investigate the relationships among 
the individual subtests for the entire sub-
ject sample, a total of 78 t tests between all
possible pairs of cross-lagged correlation co-
efficients is necessitated. In calculating 78
nonindependent t tests, however, the choice
of an appropriate alpha level is a definite
problem. Several solutions are available (e.g.,
one might correct for multiple nonindepend-
ent comparisons through a modified New-
man-Keuls approach) but the most con-
servative appears to be that suggested by
Campbell, Miller, Lubetsky, and O’Connell
(1964). To generate an appropriate compari- 
son using this approach one must determine 
a value that would occur only once in 100 × 
78 times, given a true difference of zero. To
determine this quantity, one would derive
the z value corresponding to a probability
of 1/(100 × 13 × 12/2), or p = .00012820.
For such a probability, a corresponding z
value of 3.6559 is required. Similarly, the
necessary value for p = .05 would be based
upon a calculation of 1/(20 × 13 × 12/2),
or p = .00064102, z = 3.2202. With these
corrected values, we can begin to investigate 
the pattern of causal relationships within 
the total sample of subjects, and within the 
two subgroups, the core and suburban 
samples. 
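The corrected critical values can be reproduced approximately with Python's standard library. This sketch assumes that the 13 variables are the 11 achievement subtests plus the verbal and nonverbal IQ batteries, and that the reported z values are upper-tail (one-tailed) cutoffs; both readings are inferences from the figures in the text rather than statements made by the authors.

```python
from statistics import NormalDist

N_VARIABLES = 13                                      # assumed: 11 achievement subtests + 2 IQ batteries
N_COMPARISONS = N_VARIABLES * (N_VARIABLES - 1) // 2  # 78 pairs of cross-lags


def corrected_critical_z(one_in):
    """Critical z for a result expected once in `one_in` sets of 78 comparisons."""
    p = 1.0 / (one_in * N_COMPARISONS)
    # One-tailed cutoff: the upper-tail area beyond z equals p.
    return p, NormalDist().inv_cdf(1.0 - p)


for nominal, one_in in ((".01", 100), (".05", 20)):
    p, z = corrected_critical_z(one_in)
    print(f"nominal {nominal}: p = {p:.8f}, z = {z:.4f}")
```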

Before examining the differences between
all possible pairs of cross-lagged correlation
coefficients, however, a final comment on
this technique is in order. One of the major
assumptions of the cross-lagged panel tech-
nique is that of “stationarity” (Rozelle &
Campbell, 1969), that is, that the common
factor structure of the tests employed at
both points in time remains constant. A
necessary consequence of such an assump-
tion is that the synchronous correlations
are equal at both points in time (e.g.,
$r_{I_4A_4} = r_{I_6A_6}$). An examination of Table 1,
however, reveals that the synchronous cor- 
relations change more than would be ex- 
pected by sampling error alone, and we must 
therefore conclude that the common factor 
structure changes over time. 

Two different sources of change can be 
made to account for the synchronous corre-
lation differences: changes in kind and
changes in amount. With changes in kind,
the loading of a test on one common factor 
changes while the test’s loading on another 
common factor remains the same or changes 
in the opposite direction. A good example 
of changes of kind is provided in infant 
intelligence tests. These tests tend to meas- 
ure motor skills more than mental ability, 
while for older children, the opposite holds. 
Suppose that intelligence (I) and some motor 
skill (M) were measured at ages 1 and 5 
for the same subject sample. If $r_{I_1M_5}$ was
greater than $r_{M_1I_5}$, we could not conclude
that intelligence causes motor skill, but
rather that the two measures of motor skills
correlate more highly than a measure of
motor skill with mental ability.

Fig. 3. Uncorrected cross-lagged correlations be-
tween Graphs and Tables and References subscales
for core school respondents.

With changes in amount, all the loadings 
of a test change by a multiplicative constant. 
In a sense, the common factor structure of 
the tests does not change, but there are 
changes in the amount of communality and, 
therefore, uniqueness. Consider, for example, 
the pattern of intercorrelations presented in 
Figure 3.* On the basis of the cross-lagged
correlations alone, it seems obvious that the 
ability to read graphs and tables predicts 
the ability to use references. These same 
results could have occurred, however, if (a) 
the reliability of the References test de- 
creased from Grade 4 to 6, while that of the 
other test increased, or, (b) the specific 
variance of the References test increased 
over time, and decreased for the test of 
graph and table interpretation. 

To test the viability of these alternatives, 
we could inspect the synchronous correla- 
tions of each of these two variables with all 
the other variables employed in this investi- 
gation at both measurement periods. Such 
an analysis bolsters the plausibility of the 
alternatives noted above, since, in every case 
except that under consideration, the syn-
chronous correlations involving the Refer-
ences test declined from grade four to six,
while those of the Graphs and Tables test 
increased. Given this systematic shift in 
synchronous correlations, we felt it plausible 
to assume that the bulk of the changes in 
factor structure were changes of amount, 
not kind. 

Clearly, some means of correcting for 
differential reliability or specificity devia- 
tions that might occur between measure- 
ment periods is necessary if the full value of 
the cross-lagged panel technique is to be 


* These data were taken from the core sample. 
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realized. The simplest solution available 
consists of a correction for attenuation of 
the cross-lagged values. Although this solu- 
tion has the advantage of simplicity, it cor- 
rects only for reliability changes and cannot 
be used to control for any changes in the 
specificity of tests that might occur over time. 
A more satisfying solution would involve 
a factor analytic approach. Within the 
fourth- and the sixth-grade measurement 
sessions, a separate factor analysis of the 
matrix of test correlations could be com- 
puted. If the assumption of “changes of 
amount only” is valid, then the cross-lagged 
correlations would be equal if corrected by 
the ratio of appropriate communalities, as 
presented in the following formulae: 


[Equations 1 and 2: each cross-lagged correlation multiplied by a ratio of communalities; the formulae are not legible in this copy.]


Conceptually, this solution seems ideal; the 
wide dispersion of communality estimates 
generated by various factor analytic tech- 
niques, however, renders this approach in- 
operable in practice. 
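The reasoning behind a communality-ratio correction can be sketched as follows. This is a derivation under an assumed "changes of amount only" model, not a transcription of the authors' formulae: $h_{X_t}$ denotes the square root of the communality of test $X$ at time $t$, and $\rho$ is the lagged correlation between the common parts of $X$ and $Y$, taken to be the same in both directions when neither variable has causal priority. All of these symbols are introduced here for illustration.

```latex
% Sketch under a "changes of amount only" assumption; symbols are illustrative.
\begin{align*}
r_{X_1Y_2} &= h_{X_1}\,h_{Y_2}\,\rho, &
r_{Y_1X_2} &= h_{Y_1}\,h_{X_2}\,\rho, \\
r'_{X_1Y_2} &= r_{X_1Y_2}\left(\frac{h^2_{X_2}\,h^2_{Y_1}}{h^2_{X_1}\,h^2_{Y_2}}\right)^{1/4}, &
r'_{Y_1X_2} &= r_{Y_1X_2}\left(\frac{h^2_{X_1}\,h^2_{Y_2}}{h^2_{X_2}\,h^2_{Y_1}}\right)^{1/4}, \\
r'_{X_1Y_2} &= r'_{Y_1X_2} = \rho\,\sqrt{h_{X_1}h_{X_2}h_{Y_1}h_{Y_2}}.
\end{align*}
```

On this reading, any difference that remains between the corrected cross-lags cannot be attributed to shifts in communality alone.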

A more intuitive solution to the problem 
of the estimation of communality ratios 
was thus employed in the present investiga- 
tion. For each variable pair, the synchronous 
correlation at Grade 4 ($r_{X_1Y_1}$) was divided
by the synchronous correlation of these 
same variables at Grade 6. The resulting 
matrix of ratios should be single factored if 
the “change of amount” assumption is valid
(see Kenny, 1970, for a more formal mathe- 
matical development of these arguments). 
Spearman’s (1927) “two factor” technique 
was employed in the solution of this matrix

Fig. 4. Corrected cross-lagged correlations be-
tween Graphs and Tables and References subscales
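The ratio step described in the text, and the sense in which the resulting matrix is "single factored", can be illustrated with a small sketch. The loadings below are hypothetical and are not taken from Table 1. They simply show that when only the amount of common variance changes, each ratio of synchronous correlations is the product of two per-test multipliers.

```python
def synchronous_ratios(r_time1, r_time2):
    """Divide each Grade 4 synchronous correlation by its Grade 6 counterpart."""
    return {pair: r_time1[pair] / r_time2[pair] for pair in r_time1}


# Hypothetical loadings on a shared factor at each grade (not values from Table 1).
loadings_g4 = {"Graphs": 0.70, "References": 0.85, "Vocabulary": 0.80}
loadings_g6 = {"Graphs": 0.80, "References": 0.75, "Vocabulary": 0.80}
pairs = [("Graphs", "References"), ("Graphs", "Vocabulary"), ("References", "Vocabulary")]

# With a single common factor, each synchronous correlation is a product of loadings.
r_g4 = {(x, y): loadings_g4[x] * loadings_g4[y] for x, y in pairs}
r_g6 = {(x, y): loadings_g6[x] * loadings_g6[y] for x, y in pairs}

ratios = synchronous_ratios(r_g4, r_g6)
# Per-test multipliers implied by the loadings; each ratio should equal their product.
multiplier = {t: loadings_g4[t] / loadings_g6[t] for t in loadings_g4}
for (x, y), ratio in ratios.items():
    print(f"{x} with {y}: ratio = {ratio:.4f}, product = {multiplier[x] * multiplier[y]:.4f}")
```

In real data the multipliers are unknown, which is why the text turns to a one-factor solution of the ratio matrix to estimate them.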
TABLE 2
CORRECTED CROSS-LAGS AND t VALUES: TOTAL GROUP, CORE, AND
SUBURBAN SCHOOLS, RESPECTIVELY

Comparison | Total rX1Y2 | Total rY1X2 | Total t | Core rX1Y2 | Core rY1X2 | Core t | Suburban rX1Y2 | Suburban rY1X2 | Suburban t

Vocabulary with Read Comp | .7065 | .7118 | -.723 | .5721 | .5845 | -.627 | .6899 | .6942 | -.479
Vocabulary with Spelling | .5965 | .5023 | 9.320 | .4944 | .4308 | 2.721 | .5940 | .4832 | 9.189
Vocabulary with Capitalizing | .5914 | .5369 | 5.263 | .4768 | .4530 | .992 | .5551 | .4980 | 4.408
Vocabulary with Punctuation | .5344 | .4930 | 3.594 | .4342 | .4441 | -.389 | .5035 | .4442 | 4.149
Vocabulary with Usage | .6562 | .6740 | -2.099 | .5193 | .5791 | -2.787 | .6397 | .6458 | -.682
Vocabulary with Map Reading | .5202 | .5483 | -2.456 | .3519 | .4099 | -2.060 | .4898 | .5179 | -2.005
Vocabulary with Graphs | .5342 | .5497 | -1.385 | .3496 | .3996 | -1.749 | .5170 | .5236 | -.488
Vocabulary with References | .6026 | .5839 | 1.851 | .4479 | .4573 | -.373 | .5836 | .5598 | 1.975
Vocabulary with Ar. Concept | .5702 | .5719 | -.168 | .4113 | .4374 | -.996 | .5417 | .5306 | .989
Vocabulary with Ar. Problem | .5054 | .4880 | 1.467 | .3548 | .3897 | -1.250 | .4767 | [illegible] | 1.931
Vocabulary with Verbal IQ | .7164 | .7090 | 1.027 | .6133 | .5884 | 1.317 | .6942 | [illegible] | [illegible]
Vocabulary with Nonverbal IQ | .5134 | .5317 | -1.760 | .4220 | .4090 | .580 | .4557 | [illegible] | [illegible]
Read Comp with Spelling -5825 | .4988 | 8.063] .4941 | .4240 | 2.983] .5770 des ips 
Read Comp with Capitalizing | .5982 | .5472 | 4.950| .5031 | .4871 694 jue ba saad 
Read Comp with Punctuation | .5408 | .5071 | 2.945 .4691 | .4556 - 546) ape ds de 
Read Comp with Usage .6377 | .6431 | —.603| .5203 | .5473 ENS PR aie 355 
Read Comp with Map Reading | .5313 | .5589 |—2.453| .3545 | .4251 |— PA pue ge a 
Read Comp with Graphs -5480 | .5748 |—2.452| .3678 | .4201 st Sd iene Fs 
Read Comp with References .6220 | .5965 | 2.621| .4769 | .4857 mes on 1058 enar 
Read Comp with Ar. Concept | .5656 | .5965 |—2.998| .4275 | .4757 in iym 030 ds 
Read Comp with Ar. Problem | .5259 | .5244 - 126) .3906 | .4230 |— een cae pod 
Read Comp with Verbal IQ ^ | .6942 | .7006 | —.846| .6028 | .5871 Be Apel a 
Read Comp with Nonverbal IQ | .5272 | .5624 |—3.414| .4241 | .4654 |— iis oe 5710 Te, 
Spelling with Capitalizing .5697 | .5740 | —.424| .4923 a (ane ee "5380 Pia ae, 
Spelling with Punctuation i quan ae feed he eR en are KEEN 

Spelling with Usage : . Iss f, 3 R Y .4104 |--3. 
Spelling with Map Reading E ka nb epe beens i haber “alates 

pelling with Graphs . : ee sea at “4793 |-1.733] . «5751 |—3. 
Spelling with References .5304.| .5789 |—3.970] .4305 4m hie vr gU roues 
Spelling with Ar. Concept .4532 | .4979 |—3.786] .3288 “3843 |—2.092] 4901 | .4660 —3,088 
Spelling with Ar. Problem 4314 | .4786 |—3.787| id “gone | real gates ebay 29/480 
Spelling with Verbal IQ por tmp t peer teneros Non ees 
Spelling with Nonverbal IQ .4282 | .4372 | —.790) ied 75388 |—1.978| .5182 | .5733 —4.307 
Capitalizing with Punctuation | .5475 | .6045 |—5.539 por 75930 |—2.112, 5242 | .5058 |—3 241 
Capitalizing with Usage .5546 | .6032 |—4.739| A EOS — "64a! 74318 „4444 | —.818 
Capitalizing with Map Reading | .4584 | .4769 |—1.457| oe ase Eod “4661 | 4704 | — 290 
Capitalizing with Graphs .4833 | .5003 |—1.381| DU 74503 1.050) .5733 | .5561 | 1.369 
Capitalizing with References | .5887 | .5794|  .899|. 117 | 14523 |—1.564| .4898 | .6158 |—1.862 
Capitalizing with Ar. Concept | .5211 | .5548 |—2.994. ais “4090 | —.305| .4870 | .4682 | 1.280 
Capitalizing with Ar, Problem | .5101 | .5035 - 548) eae ae 1.16, .5599 | .6005 |—3,411 
Capitalizing with Verbal IQ | .5995 | .6324 |—3.523 "Aov6 | 14494 | 767| 4539 | .4900 |—2 569 
Pues vith Nonwstbal IG) 2070 ue 2L IT 
Punctuation with Usage .5204 | .5611 |—3.642 13143 | .3099 | -143| .3051 | .3970 | —.112 
Punctuation with Map Reading .4296 | .4279 | —. 126 72900 | .3075 | —.576| .4066 | .4214 | —.930 
Punctuation with Graphs .4309 | .4432 | —.929 14426 | .4111 | 1.900 .5223 | -5022 | 1.433 
Punctuation with References | .5441 | .5221| 1.911 24988 | .3953 | 1.255] .4501 | .4819 |—2.134 
Punctuation with Ar. Concept | .4929 | .5088 |—1.316 n 3806 | —.663, 4526 | -4446 .519 
Punctuation with Ar. Problem | .4781 | .4742 per! “5020 | 5011 | 076] .4956 | .5546 |—4.419 
Punctuation with Verbal IQ .5408 | .5834 |—4. 88| 4185 | .4073 | —..437, .4212 | .4428 | —1.427 
Punctuation with Nonverbal IQ| .4706 | .4850 iy “9880 | .3343 |—1.556| .4139 | .4250 | —.719 
Usage with Map Reading «4507 | AGOE o Ed. 3000 | 3508 (1.898) 4011 | 4725 | — 778 
Usage with Graphs .4827 | .4975 ee 200 28 aed A |S tor | 7 106 
Sage with References -5768 | .5532 oa ‘410 | .3985 | 770.4735 | -4923 —1.344 
Usage with Ar. Concept C5183 | oer 1410 3058 | 3899 | —.850 4318 | .4549 |—1.572 
Usage with Ar. Problem “A704 | ABT" | 0M| 5976 | 5721 | 1.821) 16453 | .6433 | —.197 
TABLE 2 (Continued)

Comparison (X with Y) | Total: rX1Y2 | rY1X2 | t | Core: rX1Y2 | rY1X2 | t | Suburban: rX1Y2 | rY1X2 | t
Usage with Nonverbal IQ | .4929 | .5130 | -1.839 | .4283 | .4195 | .362 | .4342 | .4651 | -2.240
Map Reading with Graphs | .4964 | .4883 | .666 | .2677 | .2993 | -1.045 | .4977 | .4756 | 1.537
Map Reading with References | .5067 | .4947 | 1.004 | .3176 | .8220 | -.150 | .4946 | .4822 | .861
Map Reading with Ar. Concept | .5102 | .5253 | -1.278 | .3091 | .3490 | -1.360 | .4983 | .5026 | -.305
Map Reading with Ar. Problem | .4396 | .4464 | -.521 | .2729 | .2868 | -.453 | .4182 | .4260 | -.492
Map Reading with Verbal IQ | .5696 | .5501 | 1.780 | .4246 | .3467 | 2.808 | .5449 | .5331 | .888
Map Reading with Nonverbal IQ | .4900 | .5096 | -1.634 | .3586 | .3147 | 1.500 | .4531 | .4863 | -2.248
Graphs with References | .5430 | .4154 | 2.403 | .3555 | .3057 | 1.704 | .5378 | .5153 | 1.607
Graphs with Ar. Concept | .5345 | .5394 | -.437 | .3244 | .3341 | -.330 | .5306 | .5351 | -.336
Graphs with Ar. Problem | .4899 | .4665 | 1.873 | .8185 | .2534 | 2.134 | .4781 | .4664 | .194
Graphs with Verbal IQ | .5746 | .5622 | 1.153 | .4135 | .3434 | 2.490 | .5599 | .5600 | -.006
Graphs with Nonverbal IQ | .4985 | .5152 | -1.414 | .3331 | .3079 | .850 | .4766 | .5086 | -2.254
References with Ar. Concept | .5367 | .5809 | -4.092 | .3897 | .4314 | -1.578 | .5179 | .5618 | -3.357
References with Ar. Problem | .5941 | .5435 | -1.700 | .3555 | .3991 | -1.574 | .5168 | .5316 | -1.093
References with Verbal IQ | .6234 | .6531 | -3.246 | .5265 | .5194 | .818 | .6006 | .6373 | -3.274
References with Nonverbal IQ | .5076 | .5678 | -5.437 | .4284 | .4465 | -.717 | .4660 | .5355 | -5.051
Ar. Concept with Ar. Problem | .5482 | .5368 | 1.043 | .4032 | .3945 | .835 | .5290 | .5185 | .790
Ar. Concept with Verbal IQ | .6343 | .6151 | 2.036 | .5282 | .4609 | 2.846 | .6003 | .5906 | .824
Ar. Concept with Nonverbal IQ | .5504 | .5617 | -1.055 | .4211 | .4197 | .055 | .5117 | .5266 | -1.105
Ar. Problem with Verbal IQ | .5590 | .5590 | 0.000 | .4727 | .4150 | 2.273 | .5248 | .5359 | -.831
Ar. Problem with Nonverbal IQ | .4844 | .4861 | -.139 | .4146 | .3304 | 3.060 | .4370 | .4580 | -1.404
Verbal IQ with Nonverbal IQ | .6274 | .6658 | -4.508 | .5810 | .6149 | -1.848 | .5697 | .6152 | -4.042


(see Harman, 1960), and the resulting com- 
munality estimates were employed in cor- 
recting the cross-lagged correlations for re- 
liability and specificity changes, through the 
use of the correction formulae presented 
above.⁹ Employing these communality estimates
in the correction formulae generally
lessens the differences between the cross- 
lagged values. In the illustration comparing 
reference versus graph and table interpreta- 
tion, for example, the correction process has 
reduced the difference in cross-lagged corre- 
lation values from .22 to .05 (see Figure 4). 
The same general approach was employed 
in investigating all possible subscale rela- 
tions obtained over the total sample, and 


9 The procedure employed here is not a true factor
analysis, because many of the ratios entered in
the matrix will exceed unity. Nevertheless, it was
felt that the procedures outlined by Spearman
could legitimately be employed in this analysis,
because we assumed that for each variable the
unique factor loadings can freely change over time,
while all the orthogonal common factor loadings
change by some constant. That is,

$$A_2 = K A_1,$$

where $A_1$ are the common factor loadings at time t,
$A_2$ are the common factor loadings at time t + k,
and K is a diagonal matrix of communality ratios
(see Kenny, 1970).


also within the core and suburban subsamples.¹⁰
A series of t tests was computed on
the difference between all pairs of corrected
cross-lagged values, and these results are
summarized in Table 2.

A more graphic representation of the re- 
sults obtained for the suburban sample is 
presented in Table 3. Again, it must be 
stressed that the t-test differences noted in
these tables are based upon the corrected 
cross-lagged correlation coefficients, and the 
significance levels employed have been cor- 
rected for multiple comparisons. Thus it 
seems likely that these results, if erroneous, 
will be conservatively biased. 


Discussion 


In the statistical comparison of composite 
IQ and achievement test scores presented
earlier, the predominant causal sequence 
over all subjects was in the direction of in- 
telligence causing later achievement. Divid- 
ing the total sample into core and suburban 
subunits, however, revealed that this se- 


10 Within the bounds of sampling error, the matrix
of ratios of synchronous correlations appeared
to be single factored, thus supporting our assumption
of "changes in amount only" in the factor
structure.


TABLE 3
PATTERNS OF CAUSAL INTERRELATIONS: SUBURBAN SAMPLE

Cause variables (rows) and effect variables (columns), numbered 1-13:
1. Vocabulary
2. Reading comprehension
3. Spelling
4. Capitalization
5. Punctuation
6. Usage
7. Map reading
8. Graph and table
9. References
10. Arithmetic concept
11. Arithmetic problems
12. Verbal IQ
13. Nonverbal IQ

[Cell entries (* and **) marking the significant cause-effect pairs are not recoverable from this copy.]

* p < .05 (i.e., p < .00064102, t > 3.2202, as discussed in text).


** p < .01 (i.e., p < .00012820, t > 3.6559). 


quence held only within the suburban 
sample; if any relationship existed in the
core sample, it was opposite to that of the
suburban group.

Given the dangers involved in the use of 
composite scores, it is perhaps wise to focus 
upon the more specific subscale relationships 
before commenting upon this result. With 
13 subtest scores employed in this investigation,
78 comparisons are possible. After
alpha was adjusted to account for these multiple
comparisons, 22 significant differences were
found in an analysis involving all subjects
(Table 2). (Without the alpha adjustment,
33 of 78 t values would have reached the
p < .05 level.) Within the suburban sample,
7 comparisons were significant; among the
core students, however, not even one of the 
78 t values was significant. 
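
The alpha adjustment reported with Table 3 can be checked with a few lines of arithmetic. The sketch below assumes a simple Bonferroni-style division of the nominal alpha by the number of pairwise comparisons among the 13 measures listed in Table 3. The function name is illustrative, and the critical t values are not recomputed because the degrees of freedom are not given here.

```python
from math import comb


def adjusted_alpha(n_measures: int, nominal_alpha: float) -> tuple[int, float]:
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    n_comparisons = comb(n_measures, 2)  # unordered pairs of measures
    return n_comparisons, nominal_alpha / n_comparisons


n_pairs, alpha_05 = adjusted_alpha(13, 0.05)
_, alpha_01 = adjusted_alpha(13, 0.01)

print(n_pairs)            # 78
print(f"{alpha_05:.8f}")  # 0.00064103
print(f"{alpha_01:.8f}")  # 0.00012821
```

The printed values match the .00064102 and .00012820 given in the table note once those are read as truncated rather than rounded.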

The results of the analysis of the total 
group thus provide a somewhat misleading 
impression, since the significant causal relations
obtained depend almost completely
upon differences that exist within the suburban
sample.¹¹ A further indication of the
lack of comparability of the core and subur- 
ban groups can be gained through a consider- 
ation of the differences in causal direction- 
ality that exist among the various subscale 

11 This is understandable, since the suburban
group constitutes 73% of the total sample.


comparisons in these two groups. In almost 
40% of the 78 subscale comparisons, the 
signs of the obtained t values differ between
core and suburban samples. The difference 
between the core and suburban groups in 
mere numbers of significant causal relation- 
ships is quite striking, as is the directional 
difference in the composite-score compari- 
son; neither of these findings, however, is as 
compelling as the fact that causal direction- 
ality of the relationships between various 
concrete and abstract activities differs be- 
tween these groups almost 40% of the time. 
On the basis of these results, it is clear that 
a combination of the data from the core and 
suburban subjects can be extremely mislead- 
ing. These findings should thus be ap- 
proached with extreme caution. For this 
reason, the following discussion will be 
focused upon results obtained for the two 
subgroups separately.
There are probably a number of potential 
approaches in explaining the causal discrep- 
ancies in the findings above, and one of the 
most promising is an investigation of the 
results obtained from the suburban sample. 
A plausible explanation of these findings can 
lead to a more complete understanding of 
the lack of significant causal effects within 


the core group.
The Iowa Tests of Basic Skills is com- 
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posed of 11 subtests, the first 6 of which 
clearly depend upon linguistic abilities (see 
Table 3). While the general focus of these 
subtests is similar, the skills which they 
assess vary in degree of abstractness. On the 
basis of both the descriptive manual pro- 
vided for the Iowa tests (1956), and an in- 
vestigation of the specific items that con- 
stitute the various linguistically oriented 
subscales, it would seem that the tests of
Vocabulary, Reading Comprehension, and
Language Usage
assess abilities more abstract than those
tapped in the Spelling, Capitalization, and 
Punctuation subtests. If this evaluation is 
correct, then the results in Table 3 indicate 
that the acquisition of the more general, 
abstract cognitive abilities causes later gains 
in more specific linguistic skills. In addition 
to supporting this abstract-to-concrete ex- 
planation of linguistic development, data in 
Table 3 also demonstrate the causal ineffec- 
tiveness of the concrete skills in generating 
abstract abilities. Both Vocabulary and 
Language Usage, for example, appear to 
function as causal determinants of Spelling, 
Capitalization, and Punctuation skills; 
Reading Comprehension is somewhat less 
effective, and apparently affects only Spell- 
ing and Capitalization. 

Results consistent with these findings are 
to be found in a consideration of the effects 
of the test of Verbal IQ. This subscale of 
the Lorge-Thorndike intelligence test is a 
clear attempt to assess skills considerably 
more abstract than those measured in the 
Spelling, Capitalization, and Punctuation 
subscales of the Iowa tests. As demonstrated 
in Tables 2 and 3, the more abstract abili- 
ties tapped in the Verbal IQ test serve as 
causal determinants of these concrete skills, 
just as did those assessed in the tests of 
Vocabulary and Reading Comprehension. 

A review of the remaining scales of the 
Iowa Tests provides relatively little infor- 
mation concerning possible causal relation- 
ships among the various skills assessed 
through this device. Both the Work-Study 
and the Arithmetic subtests assess relatively 
abstract skills. The subscales of these tests, 
however, are only minimally effective as 
predictors of other skills. 

Much the same might be said of the non- 
verbal portion of the Lorge-Thorndike intelli- 
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gence test, perhaps the most abstract of all 
the scales employed in this investigation; such
an assessment of the causal efficacy of this
scale, however, could be extremely misleading.
The results indicate that nonverbal intelligence
does not directly influence the
acquisition of the concrete skills. But as data 
in Tables 2 and 3 show, nonverbal intelli- 
gence apparently causally influences verbal 
IQ, an ability which, in turn, is a predictor 
of many of the more concrete linguistic 
skills (spelling, capitalization, punctuation, 
reference usage). 

The findings indicate that an abstract-to- 
concrete causal sequence of cognitive ac- 
quisition predominates among suburban 
school children. The positive and often sta- 
tistically significant cross-lagged correlation 
values (Table 2) also indicate that the con- 
crete skills act as causal determinants of 
abstract skills; their causal effectiveness, 
however, is not as great as that of the more 
abstract abilities. Taken together, these re- 
sults suggest that the more complex ab- 
stract abilities depend upon the acquisition 
of a number of diverse, concrete skills, but 
these concrete acquisitions, taken inde- 
pendently, do not operate causally to form 
more abstract, complex abilities. Appar- 
ently, the integration of a number of such 
skills is a necessary precondition to the 
generation of higher order abstract rules or 
schema. Such schema, in turn, operate as 
causal determinants in the acquisition of
later concrete skills.

A review of Tables 2 and 3 lends support
to this observation. None of the more specific
concrete skills assessed in the various
subtests employed in this investigation
(Spelling, Capitalization, Punctuation, Reference
Usage, Arithmetic Problem Solving)
functions as a major causal determinant in
either the core or suburban sample. The
more abstract abilities (Vocabulary, Reading
Comprehension, Language Usage, Verbal
and Nonverbal IQ), however, are
clearly effective in determining later, more
specific achievement.

This pattern of findings might be explained
in terms of a simple statistical artifact,
in that there would appear to be a
greater possibility for test-specific irrele- 
vancies to cancel in tests involving more 
complex cognitive operations. The tests that 
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focus upon the assessment of a single skill or 
acquisition seem to be more vulnerable to 
the accumulation of such error (i.e., test- 
specific bias), which would vary from ad- 
ministration to administration. The attend- 
ant test-retest reliabilities of such tests 
would be adversely affected, and this, in 
turn, would lessen the chances of obtaining 
significant t differences in the tests employed
in the assessment of preponderant
causal relationships. The rather impressive 
reliabilities of the tests, as reported in the 
technical manuals (Lorge & Thorndike, 
1957; Iowa Manual, 1956), and the relia- 
bility-specificity correction process described 
earlier, however, severely limit the plausi- 
bility of this alternative. 

A more probable explanation of the re- 
sults obtained is that the preponderant 
causal sequence is indeed most accurately 
described as a progression from the abstract 
to the concrete. The ability to form abstractions
(i.e., to employ general complex
rules or schema) results in the absorption
and retention of more concrete information
and skills. The opposite sequence holds, but
in an attenuated fashion. A specific concrete
acquisition, perhaps a necessary component
in the formation of a more general
rule, is causally ineffective unless it can be
integrated with other concrete acquisitions
in generating a more abstract cognitive
schema. Taken independently, specific concrete
skills and information are not effective
determinants of abstract rules. Apparently, 
the acquisition of a combination of diverse 
(concrete) skills is a necessary, but not 
sufficient, condition for the formation of 
abstractions. 

This observation might provide a key to
the explanation of the complete lack of
significant causal relations in the core sample.
The assimilation of specific concrete
skills may proceed within the core schools
at a pace so retarded that the integration
necessary for the generation of abstract
schema simply cannot take place. If this is
so, then the orderly feedback sequence of
skill acquisition and integration would be
disrupted.

Some evidence supportive of such an in- 
terpretation is available. Statistical tests 
comparing the average scores of each of the
achievement subscales for the core and 
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suburban subsamples were performed on the 
data used in this investigation. At the fourth- 
grade level, differences in normed grade 
equivalents between core and suburban 
achievement test scores for each subscale 
were highly significant, with suburban 
children outscoring core school students. 
These differences not only were maintained 
at the sixth-grade level, but in 10 of 11 
subscales, were greater than those noted in 
the fourth grade. In 7 of these 10 instances, 
the t ratio had also increased from fourth- 
to sixth-grade test administrations. The
suburban school children greatly outper- 
formed the core students in the fourth grade, 
and lengthened their lead when tested 2 
years later. 

Discontinuities in this type of scholastic 
achievement have been noted many times 
in the past (Harlem Youth Opportunities 
Unlimited, Inc., 1964; Hentoff, 1966; Kohl, 
1968; Kozol, 1967); the contribution of this 
apparently redundant finding of the present 
investigation lies in its potential utility in 
generating an understanding of the dynam- 
ics involved in the short-circuiting of the 
intelligence-achievement sequence of cog- 
nitive acquisition in evidence among edu- 
cationally deprived groups. To be sure, the 
mere assimilation of concrete academic 
skills is retarded within the core schools. 
This is unfortunate, since the core-school 
children—the products of this educational 
system—have, in absolute terms, less of the 
information which is necessary for survival 
in today’s society. The ramifications of this 
deprivation, however, are even more devas- 
tating, since the data of this investigation 
indicate that a retardation in the mere ac- 
cumulation of specific skills and information 
results in an attenuation of the rate at which 
higher order cognitive organization principles
are formed.

In any study that investigates issues as 
complex as those discussed here, alternative 
explanations are almost always available. 
Thus, the reader should be aware of some 
of the more persuasive limitations on the 
generalization of the findings presented 
above. The ideal study of this type would 
have employed very young children as the 
primary respondents to obtain a more defini- 
tive picture of the intelligence-achievement 
relationship, unaffected by the interaction 
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of concrete and abstract cognitive functions 
occurring over time. Unfortunately, the re- 
liable assessment devices necessary for such 
an investigation simply do not exist (see 
Bayley, 1955). Results of tests of children 
younger than those in the present investiga- 
tion, and appropriate for use in a cross- 
lagged panel investigation, may well be 
available, but whether these data would 
have been any less susceptible to potential 
temporal-interactional confounding than 
those employed is debatable. 

Another, perhaps more telling, objection
that could be raised in response to the findings
of this investigation concerns the choice
of achievement test employed. The Iowa 
Tests of Basic Skills is not an example of the 
typical achievement test. In his review of the 
Iowa tests, for instance, Herrick (1959)
commented: 
This test battery cannot be considered as an 
achievement battery in the usual sense of measur- 
ing knowledge in the common content areas of the 
elementary school curriculum.... The focus of 
these tests is on the evaluation of the generalized 
intellectual skills . . . not on content per se [p. 16]. 

Both Morgan (1959) and Remmers (1959) 
made similar evaluations and each empha- 
sized the strong resemblance between the 
content of the Iowa tests and that found in 
most group tests of intelligence. A critic of
the present investigation could employ this 
marked similarity to question the obtained 
results, since both of the assessment devices 
focused upon the same general skills and 
abilities; thus, any causal differences ob- 
tained (between conceptually identical 
scales) could be considered artifactual. 

Such an argument, however, would force 
the critic to posit a number of extremely 
tenuous assumptions. For example, the 
degree of generality of the achievement test 
must closely approximate that of the intel- 
ligence test if this alternative is to be enter- 
tained. In certain subtests, this proposition 
might prove acceptable. Many of the 
achievement subscales, however, quite ob- 
viously do not approximate the degree of 
generality of the intelligence test. Further, 
these relatively concrete tests of specific 
acquisitions are the very ones that most 
often prove to be determined by the more 
general cognitive skills. In view of these 
findings, the use of a more concrete achieve- 
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ment assessment device would likely have 
enhanced the differences obtained. The ab- 
stract-to-concrete causal sequence suggested 
by the data of the present investigation, 
that is, would probably be demonstrated 
even more clearly in a study involving the 
use of an instrument focused upon the 
knowledge of very concrete specific skills 
and information. 

Such a supposition need not be left to 
speculation. Most educators would agree 
that the testing policy of the schools sam- 
pled in this investigation is not an unusual 
one. Educational systems throughout the 
country commonly employ both intelligence 
and achievement batteries in the systematic 
assessment of their students’ accomplish- 
ments. The cross-lagged panel correlational 
technique enables the educator to test the 
wisdom of this strategy, to decide between 
competing test batteries, and thus gradually 
to improve the quality of his assessment 
operations independent of test construc- 
tors’ often inflated claims. 

The use of this method in a systematic 
program of investigation would not neces- 
sarily demand a great deal of the educator, 
since, in many school situations, the neces- 
sary data are already available. If, for 
example, only a minimal temporal separa- 
tion exists between the administration of 
two or more standard assessment devices 
(IQ tests, achievement tests, etc.), and such 
tests are administered two, three, four, or
more times throughout the students’ aca- 
demic careers, then the basic raw data needs 
for the proper use of the cross-lagged panel 
analysis are satisfied.¹²
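
As a rough illustration of these data needs, the sketch below computes the two raw cross-lagged correlations from scores collected at two testing waves. It is a minimal sketch only: the scores are hypothetical, the function and variable names are invented for illustration, and no correction for reliability or specificity changes is applied.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dev_x, dev_y))
    return cov / sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))


def cross_lags(x1, y1, x2, y2):
    """Return (r_X1Y2, r_Y1X2), the two uncorrected cross-lagged correlations."""
    return pearson_r(x1, y2), pearson_r(y1, x2)


# Hypothetical scores for six students at two testing waves (illustration only).
iq_wave1 = [95, 100, 105, 110, 115, 120]
ach_wave1 = [4.1, 3.9, 4.6, 4.4, 5.0, 5.2]
iq_wave2 = [97, 99, 107, 108, 118, 119]
ach_wave2 = [5.8, 6.1, 6.5, 6.9, 7.2, 7.6]

r_x1y2, r_y1x2 = cross_lags(iq_wave1, ach_wave1, iq_wave2, ach_wave2)
print(f"r(IQ1, Ach2) = {r_x1y2:.3f}, r(Ach1, IQ2) = {r_y1x2:.3f}")
```

Comparing the two values is only the first step. A real analysis would also need the synchronous and test-retest correlations and a significance test on the difference.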

If educators throughout the country were
to embark on an investigative program of
this type, a more certain assessment of the 
sequence of cognitive development could 
result. Arising through the combined efforts 
of numerous investigators, employing many 
different tests and diverse subject popula- 
tions, these combined results would prove 
quite resistant to counterargument. Clearly,
the reliable confirmation of either of the
two competing causal hypotheses discussed


12 Ideally, information detailing item difficulties
over the entire scale, or subscale reliabilities (split
half, Kuder-Richardson, etc.) for each wave of testing
would also be obtained.
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above would have massive implications for 
educational policies and practices. 

It is our hope that this paper, and the 
analytic technique that has been proposed, 
will stimulate a program of this nature.
The problem to which this report has been 
addressed is a real and important one and 
the data for its solution are already avail- 
able—all that remains is their proper em- 
ployment. 
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EFFECTS OF NOTE TAKING AND RATE OF PRESENTATION 
ON SHORT-TERM OBJECTIVE TEST PERFORMANCE 
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This study investigated the effects of note taking on listening. Eighty- 
two undergraduates were assigned to two note-taking conditions and 
to one of three presentation conditions in a 2 X 3 analysis of variance 
design. The oral reading rate and listening efficiency of the subjects 
were assessed. Subjects not engaged in taking notes scored signifi- 
cantly better on the criterion measure. No differences attributable to 
the presentation mode were found. Aptitude X Treatment interaction 
analysis suggested, in general, low scorers on the aptitude measures 
(low-efficiency listeners) performed better when the material was 
presented at a normal rate or read and when not required to take
notes.


Although note taking is advocated by 
most teachers, some students contend that 
taking notes during a lecture hampers their 
listening comprehension. The students 
maintain that while they are busy writing 
down one point, they do not hear others. 
Surprisingly, there has been little system- 
atic effort to determine whether or not the 
student instrumental activity of note taking 
actually improves performance as measured 
in subsequent testing situations. The few 
studies dealing with the effects of note tak- 
ing on recall offer mixed support for the 
value of this activity. Crawford (1925) and 
McHenry (1969) reported significant dif- 
ferences favoring note takers on true-false 
and multiple-choice tests administered im- 
mediately following a study period. Mc- 
Henry found all three of his note-taking 
treatments (copious, abbreviated, and fact- 
principle) had a significant effect. All 
three groups scored higher than a no-note 
control group on a multiple-choice listening 
comprehension measure. Peters and Harris 
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(1970) also indicated that subjects permit- 
ted to take notes during a taped presenta- 
tion, or who were provided with prepared 
notes in topical outline form, performed sig- 
nificantly better on a subsequent multiple- 
choice test than a no-note control group. 
However, other studies provide no support 
for the advantages of note taking (Eisner & 
Rohde, 1959; McClendon, 1948; Pauk, 
1963).

One variable that had been either experi- 
mentally controlled or allowed to vary ran- 
domly in the above studies was the rate at 
which the material was presented to the 
subjects. It would seem reasonable that a 
sharp increase in the rate of presentation 
would decrease the benefits accrued from 
taking notes. That is, taking notes during a
very rapid presentation may interfere with 
listening, while at slower speeds, it may en- 
hance listening by increasing the concentra- 
tion of the student. This would suggest that 
there would be a crossover rate at which
note taking would make no difference in
performance. A relationship of this sort 
would help to explain some of the conflict- 
ing results appearing in the note-taking lit- 
erature. 

The question then arises as to where the 
crossover might occur. Although research is 
lacking, since the above studies did not re- 
port inordinately rapid rates of presenta- 
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tion, one might expect the crossover to be 
within the range of the normal rate of 
speech; that is, between 125 and 200 words per
minute (Johnson, 1966; Nichols & Stevens, 
1967; Oliver, Felko, & Holtzman, 1966; 
Taylor, 1964). Similarly, since others have 
reported that the rate of presentation with- 
out note taking affects comprehension only 
at very high rates (300 words per minute or 
more) (Fairbanks, Guttman, & Miron, 
1957; Goldstein, 1940; Jester, 1966; Nelson, 
1948; Orr, Friedman, & Graae, 1970), dec- 
rements in performance occurring with rates 
in or near the normal range must be attrib- 
uted to some other form of interference. 
Research in the area of note taking and 
its effects on listening has seldom taken in- 
dividual differences of the learner into ac- 
count. Peters and Harris (1970) investi- 
gated the effects of several global learner 
personality variables on performance with 
disappointing results. Of several possible in- 
teractions, only one was found to reach an 
acceptable level of significance. Subjects 
scoring low (tolerant) on a measure of intolerance
for ambiguity (Budner, 1963)
demonstrated inferior learning when not
permitted to take notes, whereas subjects
scoring high on this measure showed no differences
in performance whether or not they
were permitted to take notes. 
A more promising area of search for the
interaction of individual differences with
learning conditions is to look at variables
closely related and relevant to the demands
of the learning situation. Where presentation
rates are varied, two such variables
might include the individual learner’s own
rate of speech and his listening efficiency.
The purpose of the present research was
to determine: (a) the effect of note taking
on listening efficiency as measured by an
immediate objective test of learning; (b)
the effect of variations in the rate of presentation;
and (c) the possible interaction of
the two. Additionally, the effects of Aptitude
X Treatment interactions on performance
were investigated utilizing aptitude
variables thought to be closely related to
the learning situation. The two variables selected
were listening efficiency and oral
reading rate. Both were thought to be reflective
of the rate of information process-
ing of the individual and hence to be re- 
lated to the independent variables of the 
present study. 


METHOD


Subjects 


Eighty-two undergraduate students enrolled in an 
introductory educational psychology course served 
as subjects for the study. Their participation 
earned them points toward their course grade.


Materials 


Fifty social-psychological terms appearing in 
Krech, Crutchfield, and Ballachey (1962) were
randomly sorted into three lists: List A—com- 
prised of 20 definitions and included a total of 436 
words with the terms and their definitions ranging 


TABLE 1 


SocrAL-PsycHOLoGICAL TERMS USED IN 
ASSESSMENT OF LISTENING 


EFFICIENCY 
Test A Test B 
Source list A term Source list B term 
Marginal man Neutral region 


Role conflict 

Net connectivity 
Leader 

Group ideology 
Cultural premises 
Status 
Repression 
Autism 

Prestige want 


Span of apprehension 
Cognitive multiplexity 
Assimilation 
Ethnocentrism 
Pluralistic ignorance 
Communication 
Connotative meaning 
Anticipatory socializa- 
tion 
Autokinetic phenome- 
non 
Mores 


Source list C term 


Role incompatibility 
Communication net 
Group structure 
Core culture 
Withdrawal 
Substitute goal 
Mental set 
Head 
Cognitive intercon- 
nectedness 
Adaptation level 
Prejudice 
General.. persuasibility 
Language 
Status discrepancy 
Position 
Counter-conformity 
Folkways 


Regression 


Denotative meaning 


Source list C term. 


Balance theory 

Halo effect 
Projection 
Unidimensional scale 
Pseudocommunication 


Causal system 
Membership group 
Cognitive dissonance 
Attitude constellation 
Reaction formation 
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in length from 13 to 32 words; List B—also com- 
prised of 20 other definitions and included 433 
words with individual items ranging in length from 
12 to 31 words; List C—comprised of 10 definitions 
and included 190 words with items ranging from 
10 to 28 words. An example of a term from List A 
is; “Mores—a class of norms which specify proper 
behavior in standard behavior events of vital 
importance to the members of society.” List A 
was recorded at 130 words per minute with the 
mean duration per item of 10.1 seconds. List B 
was recorded at 192 words per minute with the 
mean duration of items being 6.75 seconds. (Note 
that lists were confounded with rate of presenta- 
tion.) The terms on each list are provided in 
Table 1. 

The main experimental reading task was a 1,613- 
word passage of scientific material, “Steel as an 
Alloy,” adapted from Ausubel (1963). This passage 
was also recorded at two rates. The “normal” rate 
was 146 words per minute and the “fast” rate 202 
words per minute. 
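
The recording rates and word counts reported above can be checked against the stated item durations with a little arithmetic. The sketch below assumes a constant speaking rate with no pauses between items (an assumption, since the recordings themselves are not described in that detail); the function names are illustrative only.

```python
# Duration check for the recorded materials, using figures from the text.
# Assumption: words are spoken at a constant rate with no pauses.

def mean_item_seconds(total_words: int, n_items: int, words_per_minute: float) -> float:
    """Mean seconds per item when total_words are read at a constant rate."""
    words_per_item = total_words / n_items
    return words_per_item / words_per_minute * 60.0


def passage_minutes(total_words: int, words_per_minute: float) -> float:
    """Minutes needed to play a passage at a constant rate."""
    return total_words / words_per_minute


list_a = mean_item_seconds(436, 20, 130)    # List A, 130 words per minute
list_b = mean_item_seconds(433, 20, 192)    # List B, 192 words per minute
normal = passage_minutes(1613, 146)         # passage, "normal" rate
fast = passage_minutes(1613, 202)           # passage, "fast" rate

print(f"List A: {list_a:.2f} s per item")
print(f"List B: {list_b:.2f} s per item")
print(f"Passage: {normal:.2f} min (normal), {fast:.2f} min (fast)")
```

Under these assumptions the computed item durations fall close to the reported 10.1 and 6.75 seconds, and the passage runs roughly 11 minutes at the normal rate and 8 minutes at the fast rate.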

A 150-word nontechnical passage dealing with 
modification of social behavior was developed to 
provide a controlled stimulus for assessing the sub- 
ject’s usual oral reading rate. 


Procedure 


Fifteen subjects were randomly assigned to
each of four listening conditions: (a) the passage
was presented at a normal rate but the subject
was not permitted to take notes; (b) the passage
was presented at a normal rate and the subject
was permitted to take notes; (c) the passage was
presented at a fast rate and the subject was not
permitted to take notes; (d) the passage was
presented at a fast rate and the subject was
permitted to take notes. Eleven subjects were
randomly assigned to each of two reading conditions,
one in which note taking was permitted and the
other in which it was not.* The subjects
who read the material were allotted a time
period equivalent to that of the fast listening
group.

The subjects were administered the procedures
individually with the taped material presented via
earphones. They were informed at the outset that
the study was concerned with their performance on
a series of listening tasks, that they would have to
listen carefully, and that they would be tested on
the material. The sequence of events was the same
for all subjects: they first listened to List A and
then were tested on Test A; they then listened to
List B and then were tested on Test B; they were
then administered the oral reading rate test; they
then listened to the taped lecture material,
followed by the criterion test. The subjects in the
note-taking treatments were told, prior to the
lecture presentation, “You probably should take
notes on the material on the paper I have
provided.” Those subjects who were in the treatments
where notes were not permitted were instructed,
“Listen carefully.” No mention was made of notes
and no paper was provided the subjects who were
in these treatments.

* It should be noted that the random assignment
of subjects to treatments was separate for the
listening and reading conditions, though both
samples were drawn from the same undergraduate
course.


Measures 

Listening efficiency. Following the taped
presentation of each list of definitions, a fill-in test
requiring recall of the terms was administered.
Test A included the 20 definitions from List A
plus 5 definitions from List C as a control for
prior learning. Test B included the 20 definitions
from List B plus the remaining 5 definitions from
List C. The reliabilities of the two tests were .54
and .70, respectively. The difference between the
subject’s scores on Tests A and B constituted a
listening efficiency score. The use of the difference
score, while less reliable, permitted assessment of
the subject’s ability to process information under
rapid presentation conditions adjusted for
individual differences under normal presentation
conditions.
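
The remark that the difference score is less reliable than its components follows from the classical formula for the reliability of a difference between two measures. The sketch below applies that formula to the two reported reliabilities. The correlation between Tests A and B is not reported, so the value used here (.40) is purely hypothetical, as is the assumption of equal score variances.

```python
def difference_score_reliability(r_xx, r_yy, r_xy, sd_x=1.0, sd_y=1.0):
    """Classical reliability of the difference D = X - Y.

    r_xx, r_yy : reliabilities of the two component scores
    r_xy       : correlation between the two component scores
    sd_x, sd_y : standard deviations of the component scores
    """
    covariance = r_xy * sd_x * sd_y
    true_variance = sd_x ** 2 * r_xx + sd_y ** 2 * r_yy - 2 * covariance
    observed_variance = sd_x ** 2 + sd_y ** 2 - 2 * covariance
    return true_variance / observed_variance


# Reliabilities of Tests A and B are from the text; r_xy = .40 is a
# hypothetical value chosen only for illustration.
r_diff = difference_score_reliability(0.54, 0.70, r_xy=0.40)
print(f"Reliability of the difference score: {r_diff:.2f}")
```

With these inputs the difference score is less reliable than either test alone, and the loss grows as the assumed correlation between the two tests increases.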

Oral reading rate. The subject was requested
to read aloud the 150-word passage of nontechnical
material. The time in seconds from start to
completion constituted the measure for oral reading
rate. The stability of the measure across two
occasions during a single 1-hour testing period was
found to be .55 (N = 29).

Learning criterion. A 25-item five-alternative
multiple-choice test on the lecture material served
as the criterion measure. The internal consistency
reliability (K-R 20) was .52.
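
For readers unfamiliar with the index, Kuder-Richardson Formula 20 can be computed from a matrix of right/wrong item scores as sketched below. The response matrix is invented for illustration and is not the study's data; the sketch uses the population variance of total scores, which is one common convention.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of 0/1 item-score rows.

    Each row holds one examinee's item scores (1 = correct, 0 = wrong).
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)   # population variance of total scores
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion passing item j
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical data: 5 examinees by 4 items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(example):.2f}")
```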

Number of notes. For those treatments where
notes were encouraged the notes were retained and
a count was made of the number of words they
contained as an index of the extensiveness of the
note-taking activity.

RESULTS 

No significant differences were found
among the six groups on any of the aptitude
measures. Of concern was the question of
the effects on the learning criteria of taking
notes, variations in the presentation of the
material, and the interaction of the two.
The results of a three-way analysis of variance
(Notes × Presentation × Sex) with
unequal cell sizes yielded F = 3.12, df =
1/70, p < .05 for the main effect due to note
taking. Significantly more correct responses
to the criterion test (X = 11.44) were made
by subjects who were not engaged in taking
notes than by those actually taking notes
(X = 9.92). None of the other sources of
variance were found to have significant
effects (p > .05).
To determine if differential learning took
place as a result of the individual differences
in learner aptitudes in the different
treatments, the criterion test scores of the
subjects were regressed on the aptitude
scores (oral reading rate, Test A, Test B,
and listening efficiency) separately by
treatment conditions. A comparison was
made of the slopes of the regression lines
thus obtained from each aptitude-criterion
pair. This procedure indicated no significant
differences in regression slopes when either
the Test A score (based on listening to a
passage presented at a normal rate) or oral
reading rate was defined as the aptitude
variable. However, when using the Test B
score and the listening efficiency score as
aptitudes, the analysis yielded F = 5.34, df
= 5/70, p < .001; and F = 4.74, df = 5/70,
p < .01, respectively, for differences in
slope of the regression lines obtained by
regressing the learning criterion scores on the
aptitude scores.
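
A comparison of regression slopes across treatments is commonly carried out as an F test that contrasts a model with a separate slope per group against a model with one common slope. The sketch below illustrates that general technique. It is not necessarily the exact procedure used here, and the data are fabricated for illustration; only the group sizes (four groups of 15 and two of 11) come from the text. With six groups and 82 subjects the test has 5 and 70 degrees of freedom.

```python
def _sums(x, y):
    """Return (Sxx, Sxy, Syy), the centered sums of squares and products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxx, sxy, syy


def slope_homogeneity_f(groups):
    """F test for equality of regression slopes across groups.

    groups: list of (x, y) pairs, one pair of equal-length lists per group.
    Returns (F, df_numerator, df_denominator).
    """
    parts = [_sums(x, y) for x, y in groups]
    k = len(groups)
    n_total = sum(len(x) for x, _ in groups)
    # Full model: each group has its own intercept and slope.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in parts)
    # Reduced model: separate intercepts, one pooled within-group slope.
    sxx_w = sum(p[0] for p in parts)
    sxy_w = sum(p[1] for p in parts)
    syy_w = sum(p[2] for p in parts)
    sse_reduced = syy_w - sxy_w ** 2 / sxx_w
    df1, df2 = k - 1, n_total - 2 * k
    f_value = ((sse_reduced - sse_full) / df1) / (sse_full / df2)
    return f_value, df1, df2


# Hypothetical aptitude (x) and criterion (y) scores for six treatments.
sizes = [15, 15, 15, 15, 11, 11]              # group sizes from the text
slopes = [-0.4, 0.6, 0.5, 0.2, -0.6, 0.3]     # invented slopes
groups = []
for n, b in zip(sizes, slopes):
    x = [float(i) for i in range(n)]
    y = [10.0 + b * xi + ((7 * i) % 5 - 2) for i, xi in enumerate(x)]
    groups.append((x, y))

f_value, df1, df2 = slope_homogeneity_f(groups)
print(f"F = {f_value:.2f}, df = {df1}/{df2}")
```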

Within the no-notes condition, when the
materials were presented at a normal rate
or read, the correlations between Test B
score and the learning criterion were r =
-.43 and -.61, respectively, whereas
within the same presentation conditions,
but when notes were permitted, the
correlations were .61 and .27, respectively. For
the fast presentation rate a reverse pattern
was found (no notes, r = .55; notes, r =
.17). That is, low scorers on the aptitude
measure performed better on the criterion
when not taking notes during normal
presentation rates or when reading. High
scorers on the aptitude measure performed
better when not taking notes during
presentations of a faster rate.

Similarly, within the no-note-reading
condition the correlation of the listening
efficiency measure with the criterion test
score was -.82 while in the notes-reading
condition it was .20. For this aptitude
measure the regression slopes for the other
conditions were not significantly different
although they were in the same direction as
those for the regression of the criterion on
Test B scores.

In general, the interactions suggest that
low scorers on the aptitude measures
(low-efficiency listeners) performed better when
material was presented at a normal rate or
read and when they were not engaged in
taking notes. Note taking appeared either
not to interfere with learning or to be
advantageous for high-aptitude scorers.


DISCUSSION


One of the purposes of this research was 
to determine the effect of note taking upon 
performance on an immediately adminis- 
tered multiple-choice test. The results have 
indicated a deleterious effect of such activ- 
ity. This finding is contradictory to the re- 
sults of previous research which suggested 
either no effect, or a facilitating effect for 
note taking. Yet, for all three presentation 
conditions in the present study, the effect 
holds. 

The notes and no-notes conditions did not
differ significantly for the fast taped
presentations.
This suggests the hypothesized crossover 
point may be close to this presentation rate 
value (that is, about 200 words per min- 
ute). As such, variations in presentation 
rates and complexity of the materials from 
study to study are probable contributors to 
the diversity of the findings. 

That there were no significant differences 
across presentation variations, one of which 
involved reading rather than listening, 
raises the question as to whether or not note 
taking actually affects listening behavior or 
some other information processing behavior. 
That is, the present evidence suggests that 
note taking limits the amount of informa- 
tion processed or stored, whether it is pre- 
sented orally or in written form. 

The analysis of the Aptitude × Treatment
interactions provides additional
clarification of the relationships between rate,
note taking, and the learner variables. The 
deleterious effect of taking notes was least 
evident in the efficient listeners (high scor- 
ers on Test B and the difference measure 
Test A — Test B). This group might be 
more accurately called efficient information 
processors since the measures involved more 
factors than just listening. Having first 
gone through List A and the test on that 
list, the subjects were more aware of what
was expected of them when presented with
List B. Although List B represented a more 
rapid presentation rate, the "efficient" lis- 
teners scored higher on the test for this list 
than they did on Test A. The "inefficient" 
listeners’ scores on Test B fell below those 
of Test A. This improvement exhibited by 
the efficient learners reflects a more adap- 
tive response for the requirements of the 
task.

These findings suggest there does exist an
interaction between presentation rate and 
note taking but that the crossover point 
varies with the individual's information- 
processing efficiency. Specification of both 
the characteristics of the subjects and of the 
rates of presentation of the auditory stimuli 
therefore would seem essential if the con- 
tradictory findings of research on the effects 
of note taking are to be understood. If, as
suggested earlier, different lecture material 
contents play a role in the value of note 
taking, this variable too needs to be clearly 
specified. It appears that the effect of note 
taking on performance is more complex 
than was suggested in previous research. 


The study examined the impact of differential resultant achievement
motivation on the Zeigarnik effect and on postexam error-correcting
performance of college undergraduates. The Mehrabian Resultant
Achievement Motivation Scales were administered to 624 subjects
taking an anthropology course who were subsequently divided into
three resultant achievement motivation levels. After administration
of a midsemester course examination, all subjects were asked to recall
items from the test. Approximately half of the group received
feedback about their initial test performance. Two weeks later all
subjects were given an unannounced retest on the same exam. The
Zeigarnik effect was not found to be related to resultant achievement
motivation. High resultant achievement motivation females more
frequently made corrections to initially failed items than did those
in the low resultant achievement motivation group. Feedback was
more effective than no feedback at each resultant achievement
motivation level for both sexes.


Atkinson (1953) had demonstrated that
the Zeigarnik effect, or tendency to recall
incompleted tasks rather than completed
tasks, is a function of an individual's
achievement motivation. This is to say that
those subjects classified as being high in
need-achievement will tend to remember
more incompleted tasks than completed
tasks, while for those classified as being low
in need-achievement the reverse will be
observed. It was later realized (Atkinson &
Litwin, 1960) that qualitative differences in
motive structure may be of more importance
to the recall of tasks than are differences
in motive intensity. Consequently a
measure of resultant achievement motivation
which incorporated an individual's
level of test anxiety was proposed. This
measure involved the subtraction of Mandler
and Sarason’s 1952 TAQ score from
McClelland’s TAT score, after transforming
the raw scores into standard (z) score form.
It was expected that resultant achievement
motivation would be indicative of a person’s
motive to achieve success relative to his
motive to avoid failure.
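
The difference measure described above (standardized TAT score minus standardized TAQ score) can be sketched as follows. The raw scores are hypothetical, and standardizing within the sample using the sample standard deviation is an assumption, since the original scoring details are not given here.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores within the sample (sample standard deviation)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def resultant_achievement_motivation(tat_scores, taq_scores):
    """z(TAT need-achievement) minus z(TAQ test anxiety), person by person."""
    return [a - b for a, b in zip(z_scores(tat_scores), z_scores(taq_scores))]


# Hypothetical raw scores for five people.
tat = [4, 10, 7, 12, 2]      # need-achievement (TAT)
taq = [30, 12, 21, 15, 27]   # test anxiety (TAQ)
ram = resultant_achievement_motivation(tat, taq)
for i, value in enumerate(ram):
    print(f"person {i}: resultant score = {value:+.2f}")
```

A positive value indicates that the standardized achievement score exceeds the standardized anxiety score, which corresponds to the motive to achieve success outweighing the motive to avoid failure.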

Weiner, Johnson, and Mehrabian (1968) 
studied the relationship between resultant 
achievement motivation and the Zeigarnik 
effect in a nonlaboratory achievement set- 
ting, the college classroom, using Atkinson’s 
difference measure as well as Mehrabian’s 
Resultant Achievement Motivation Scales 
(Mehrabian, 1968). They assumed that 
items which were failed on a final examina- 
tion constituted “incompleted tasks” while 
those which were passed represented “com- 
pleted tasks.” Both indices of resultant 
achievement motivation were effective in 
predicting the Zeigarnik effect, with the 
Mehrabian Scales being more appropriate 
for that purpose. They observed that males 
who were high in resultant achievement 
motivation did, in fact, recall a signifi- 
cantly greater percentage of failed items 
than of passed items, when compared to 
males low in resultant achievement motiva- 
tion. In light of this finding, they hypothe- 
sized that those scoring high in resultant 
achievement motivation will repeat and re- 
hearse questions which they have missed 
more frequently than those scoring low. 
They further assumed that these questions
will become better learned and more likely 
remembered on a recall exercise of the test 
items. Consequently, it was reasoned that 
those subjects who are high in resultant 
achievement motivation will tend to recall 
a greater percentage of failed items than of 
passed items and should tend to get more of 
the originally missed items correct on a re- 
test, than those who are low in resultant 
achievement motivation. 

The present study attempted to deter- 
mine if this conjecture is true, particularly 
with test items in multiple-choice format, 
and for females as well as for males. As 
such, it constitutes an extension of Weiner, 
Johnson, and Mehrabian’s (1968) study 
which used a sentence-completion test and 
was limited to male students. The study 
also investigated the differential value in 
returning examination papers, with the cor- 
rect answers indicated, for review by stu- 
dents. A determination could be made of 
which group among those differing in result- 
ant achievement motivation could benefit 
most from this knowledge of error by mak- 
ing corrections on their retest responses. 


METHOD 


Instrument 


A person's score on the Mehrabian (1968) Re- 
sultant Achievement Motivation Scales was se- 
lected as the measure of need-achievement, rather 
than the standard score difference between the 
Mandler-Sarason TAQ and a thematic appercep- 
tion measure, as employed by Atkinson & Litwin 
(1960). These scales, developed from Atkinson’s 
theory of resultant achievement motivation,
consisted of 26 items for each sex. Each item required
the subject to indicate his extent of agreement
with a preference statement (e.g., I prefer A to
B) on a 7-point Likert-type scale. Higher scores
indicate that a person’s motive to achieve success
is stronger than his motive to avoid failure, while
low scores indicate that the opposite is true. Re- 
liability and validity data are reported by Mehra- 
bian (1968). 


Subjects 


Students enrolled in either of two sections of an 
undergraduate course in anthropology in a mid- 
western university of more than 20,000 students 
constituted the sample. These two sections con- 
tained 274 and 501 students, respectively, and were 
taught by the same instructor. The course is one of 
several that are offered to meet the social science 
requirement for all students planning to graduate
from the university. Approximately 80% of the
students enrolled in the course were freshmen and
sophomores. 


Procedure 


Two weeks prior to a typical midsemester ex- 
amination, the Mehrabian Resultant Achievement 
Motivation Scales were administered to both class 
sections, separate color-coded scales being used for 
males and females. Students were told by the in- 
vestigator of his interest in gathering data on the 
personal attitudes of many hundreds of college
students, that names were needed for classification
only, and that all information gathered
would be kept confidential. Approximately 15
minutes of class time was used for administration of
this instrument. 

The midsemester examination of 52 items in
multiple-choice format was prepared by the
instructor of the course and administered to all
students. The reliability of this examination was
observed to be .67 by use of Kuder-Richardson
Formula 20, and the standard error of measurement
was 3.2. Only one item was observed to be
a negative discriminator (-.04) while the best was
.40. One-half the items had discrimination indices
in the .20-.30 range and one-quarter were in the
.31-.40 range. Less than one-fifth of the items were
below 20% difficulty, about one-half were between
20%-40% difficulty, about one-fourth were between
40%-60% difficulty, and less than one-tenth of the
52 items were over 80% difficulty. Six hundred and
forty-two of the 740 subjects present for this test
were also present for the administration of the
Mehrabian Scales. Students were told that upon
completion of their examination, they should
return it to the proctor and receive from him a
second part of the exam which would be used for
research purposes and would not be graded. This part
involved the free-choice recall of nine item stems
from the preceding test, to determine if a tendency
was present to either recall items previously
answered correctly or to recall those previously
answered incorrectly. Eighteen subjects invalidated
their responses on either the midsemester test or
the recall exercise. Consequently, resultant
achievement motivation scores, midsemester test scores,
and recall scores were available for 624 students
(192 males and 432 females).
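
The standard error of measurement follows from the reliability and the standard deviation of the test scores. The sketch below uses the reported K-R 20 value of .67; the overall standard deviation is not stated in this passage, so the figure of 5.5 is an assumption chosen to lie within the range of the group standard deviations reported later for the initial exam.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Reliability is from the text; sd = 5.5 is an assumed illustrative value.
sem = standard_error_of_measurement(sd=5.5, reliability=0.67)
print(f"Standard error of measurement: {sem:.2f} score points")
```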

One week after the midsemester examination,
students in the large section had their examinations
and answer sheets returned and were allowed 15
minutes to study their corrected test. No attempt
was made to discuss the test, and students were
told by their instructor that if they had questions
regarding a particular item they should write them
down and bring them to a discussion section which
was scheduled to meet after the retest. Students
absent during this feedback session were combined
with students from the smaller section which had
not received feedback, either about their score or
about incorrect items.

Two weeks after the midsemester examination
was administered, the same examination was again
administered to all subjects in the study. A total
of 407 subjects (290 females and 117 males) were
present for all test administrations. Of this number,
147 females and 65 males had received feedback,
while 143 females and 52 males had not. This
retest was unannounced, and to promote maximum
performance the students were told that the score
they earned on the retest would apply toward their
grade in the course, a statement that was later
corrected. Between 60% and 65% of the original
subjects within each group were present for the
retest.

Since both sections of the sample met on the 
same day, 3 hours apart, students in the earlier 
section were asked by the investigator not to
mention the retest to their friends in the other section.
Since the students in both sections appeared quite 
cooperative, it was assumed that contamination 
caused by communication between the two sec- 
tions was kept to a minimum. 

Following the data collection, three levels of re- 
sultant achievement motivation were created by 
separating subjects into the top one-quarter, mid- 
dle half, and bottom one-quarter on the basis of 
scores earned on the Resultant Achievement
Motivation Scales. The maximum number of subjects
available at each phase was used in the
subsequent analyses.
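
The grouping rule described above (top quarter, middle half, bottom quarter) can be sketched as a simple rank-based split. The scores below are invented, and the way ties at the cut points are resolved (by sort order) is an assumption, since the text does not say how ties were handled.

```python
def split_into_levels(scores):
    """Split subjects into top quarter, middle half, and bottom quarter.

    scores: dict mapping a subject identifier to a scale score.
    Returns a dict with keys "high", "medium", "low", each a list of
    (subject, score) pairs. Ties at the boundaries fall wherever the
    sort places them (an assumption).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    n = len(ranked)
    quarter = n // 4
    return {
        "high": ranked[:quarter],
        "medium": ranked[quarter:n - quarter],
        "low": ranked[n - quarter:],
    }


# Hypothetical scores for 192 subjects (the number of males in the sample).
hypothetical = {f"S{i:03d}": (i * 37) % 101 for i in range(192)}
levels = split_into_levels(hypothetical)
print({name: len(members) for name, members in levels.items()})
```

For 192 subjects this rule gives groups of 48, 96, and 48.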


RESULTS 


Initial Test 


The means and standard deviations on 
the initial examination are presented in 
Table 1 by sex and by resultant achieve- 
ment motivation group. The results of anal- 
yses of variance show significant differences 
among resultant achievement motivation 
groups for females (F = 3.89, df = 2/429, p 
< .05) but not for males (F < 1). New- 
man-Keuls tests indicated that this differ- 
ence was due to the high resultant achieve- 
ment motivation group outperforming the 
medium and low resultant achievement mo- 
tivation groups. 


TABLE 1

MEANS AND STANDARD DEVIATIONS FOR
INITIAL EXAM SCORES

              Males                    Females
Group   High   Medium  Low       High   Medium  Low
n        48     96     48        108    216    108
M       34.4   33.6   33.4      34.1   32.5   32.3
SD       4.9    5.2    5.8       5.2    5.5    5.3
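
The one-way analyses of variance on initial exam scores can be approximately reconstructed from the group sizes, means, and standard deviations in Table 1. The sketch below does this from the rounded table entries, so the resulting F values only approximate the reported ones; treating the tabled SDs as sample standard deviations (n - 1 denominator) is an assumption.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA F ratio from group sizes, means, and sample SDs.

    Returns (F, df_between, df_within).
    """
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = n_total - len(ns)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Table 1 entries, ordered high, medium, low.
f_females, dfb_f, dfw_f = anova_from_summary(
    [108, 216, 108], [34.1, 32.5, 32.3], [5.2, 5.5, 5.3])
f_males, dfb_m, dfw_m = anova_from_summary(
    [48, 96, 48], [34.4, 33.6, 33.4], [4.9, 5.2, 5.8])

print(f"Females: F = {f_females:.2f}, df = {dfb_f}/{dfw_f}")
print(f"Males:   F = {f_males:.2f}, df = {dfb_m}/{dfw_m}")
```

The reconstruction gives an F near 3.9 for females with 2 and 429 degrees of freedom and an F below 1 for males, in line with the values reported in the text.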


TABLE 2

PROPORTIONS OF MALES AND FEMALES BY RAM
GROUP WHO RECALLED A GREATER
PERCENTAGE OF FAILED ITEMS
THAN OF CORRECT FROM THE
INITIAL TEST

                    Group
Subjects    High    Medium    Low
Males       .396    .365      .333
Females     .333    .435      .343


Recall Test


Two methods of determining recall pref- 
erence were used. One method called for the 
determination of both the percentage of 
subjects recalling more incorrect than cor- 
rect items, and the percentage of subjects 
recalling more correct than incorrect items. 
Another method called for the determina- 
tion of the gross numbers of correct, and of 
incorrect items recalled for each resultant 
achievement motivation group which were 
then reported as percentages of the total 
which were initially made by each group. 
Presumably this method would take into 
account the predilection to recall one type
or the other because of its initial preva- 
lence. 

Relative to the first of these methods, 
Table 2 shows the proportion of subjects in 
each of the male and female resultant 
achievement motivation groups who re- 
called a greater percentage of failed items 
than of correct items from the initial test. 
The resultant achievement motivation 
groups did not differ in the percentages of 
each group tending to recall failed items 
(and hence they did not differ in the tendency
of each to recall correct items). For
both low and medium resultant achieve- 
ment motivation groups of males, a signifi- 
cantly larger proportion of subjects recalled 
a majority of correct items than recalled a 
majority of incorrect items (p < .05 and < 
.01, respectively). At the upper resultant 
achievement motivation level, however, a 
significant difference was not observed be- 
tween these proportions. 

For females in both the high and low resultant
achievement motivation groups,
again a significantly larger proportion of
subjects recalled a majority of correct items
than recalled a majority of incorrect items
(p < .001 and < .01, respectively). No significant
difference was observed for the medium
resultant achievement motivation
group (p > .05).


TABLE 3

PROPORTION OF INITIALLY CORRECT AND
INITIALLY FAILED TEST ITEMS
RECALLED BY MALE AND
FEMALE RAM GROUPS

                                       Males                 Females
Group                            High  Medium  Low     High  Medium  Low
Initially failed items recalled  .149   .142   .147    .143   .146   .133
Initially correct items recalled .175   .172   .165    .175   .174   .178

Likewise, the proportions at each result- 
ant achievement motivation level who re- 
called a majority of failed items did not 
differ significantly from one another for ei- 
ther sex (p > .05). 
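
The within-group comparisons above contrast the proportion of subjects recalling a majority of failed items with the proportion recalling a majority of correct items. The text does not name the test used. One simple reading, adopted here purely as an assumption, is a normal-approximation test of the observed proportion against .50, treating every subject as falling into one category or the other. The sketch applies it to four Table 2 entries with the group sizes from Table 1.

```python
import math


def z_against_half(p_hat, n):
    """Normal-approximation z for an observed proportion tested against .50."""
    standard_error = math.sqrt(0.25 / n)
    return (p_hat - 0.5) / standard_error


def two_sided_p(z):
    """Two-sided p value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# (proportion recalling more failed items, group size)
cases = {
    "males, high": (0.396, 48),
    "males, medium": (0.365, 96),
    "females, high": (0.333, 108),
    "females, medium": (0.435, 216),
}
results = {}
for label, (p_hat, n) in cases.items():
    z = z_against_half(p_hat, n)
    results[label] = two_sided_p(z)
    print(f"{label}: z = {z:.2f}, p = {results[label]:.4f}")
```

Under this assumed test the medium male and high female groups depart reliably from .50, while the high male and medium female groups do not, which matches the pattern of significance described in the text.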

The results of the second method of ac- 
counting for recall preference are shown in 
Table 3. Again, the three resultant achieve- 
ment motivation levels did not differ in the 
proportion of failed items recalled, nor in 
the proportion of correct items recalled, for 
either sex (p > .05 in each case). 


Retest 


Means and standard deviations on the retest
by resultant achievement motivation
groups are presented in Table 4 for the
subjects who were present during all data
collection minus seven who were randomly
excluded to obtain proportionality. Using
the initial test score as the covariate, analyses
were conducted to determine, by sex and
by feedback group, if the resultant achievement
motivation groups differed in their
retest performance. For both males and
females, a significantly higher score was
observed for the group receiving feedback (F
= 47.9 and 200.2, respectively, p < .01),
but no differences were observed among
resultant achievement motivation levels (F <
1 in both cases).

Analyses were also performed on the percentage
of initially failed items that were
correct on the retest and on the percentage
of initially correct items which were failed
on the retest. These data are presented in
Table 5. Tests of differences in independent
proportions revealed that, for the males
who received feedback, the resultant
achievement motivation groups did not differ
in the proportion of initially failed items
answered correctly on the retest. For the
group not receiving feedback, however, the
medium resultant achievement motivation
group had a larger proportion of their
initially failed items correct on the retest than
did the low resultant achievement motivation
group (z = 2.41, p < .05). For females
receiving feedback, the high resultant
achievement motivation group had a larger
proportion of their initially failed items
answered correctly on the retest than did the
low resultant achievement motivation group
(z = 2.06, p < .05). Similar results were
observed for the group not receiving feedback.


TABLE 4

ADJUSTED MEANS AND STANDARD DEVIATIONS FOR RETEST SCORES

                      Males                     Females
Group           High   Medium   Low       High   Medium   Low
Feedback
  n              16      32      16        36      72      36
  Adjusted X   39.62   38.73   39.28     40.67   40.21   40.17
  SD            5.43    6.00    6.38      5.13    6.03    6.07
No feedback
  n              13      26      13        35      70      35
  Adjusted X   33.45   33.70   31.30     33.43   32.67   32.09
  SD             4e     4.01    5.02      5.09    5.35    8.07


TABLE 5

PROPORTION OF INITIALLY FAILED ITEMS THAT WERE CORRECT ON RETEST AND INITIALLY
CORRECT ITEMS THAT WERE FAILED ON RETEST

                                             Males                  Females
Group                                  High  Medium  Low      High  Medium  Low
Feedback
  Initially failed items later correct .647   .605   .647     .701   .678   .646
  Initially correct items later failed .171    Bu    .193     .142   .156   .157
No feedback
  Initially failed items later correct .304   .351   .261     .338   .309   .282
  Initially correct items later failed .179   .205   .218     .181   .204   .218
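
A test of the difference between two independent proportions, as used for the comparisons above, can be sketched with the pooled-variance z statistic. The proportions for females receiving feedback (.701 and .646) are from Table 5, but the numbers of initially failed items on which they are based are not reported. The counts below are rough assumptions (36 subjects per group, with about 18 and 20 failed items per subject), so the resulting z is illustrative rather than a reproduction of the reported value.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


def two_sided_p(z):
    """Two-sided p value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed item counts: subjects per group times an assumed number of
# initially failed items per subject (hypothetical values).
n_high = 36 * 18
n_low = 36 * 20
z = two_proportion_z(0.701, n_high, 0.646, n_low)
p_value = two_sided_p(z)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

With these assumed counts the statistic lands near the reported z of 2.06 and is significant at the .05 level.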

When considering retest performance in 
terms of initially correct items that were 
failed on the retest, it was found that when 
the feedback and no-feedback groups were 
combined, the high female resultant 
achievement motivation group corrected a 
significantly larger proportion of their ini- 
tially failed items on the retest than did the 
low female resultant achievement motiva- 
tion group (z = 2.99, p < .01). 

Since scores on the American College
Test (ACT) were also available from
University records for 493 subjects (140 males
and 353 females), a correlation was computed
against the resultant achievement
motivation scores. This correlation was not
significantly different from zero for either
sex (r = .159 for males, .099 for females).
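
Whether a correlation differs from zero is conventionally checked with a t statistic on n - 2 degrees of freedom. The sketch below computes that statistic for the two reported correlations and compares it with 1.96, a large-sample approximation to the two-tailed .05 critical value (the exact critical values for these degrees of freedom are slightly larger).

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_males = t_for_correlation(0.159, 140)
t_females = t_for_correlation(0.099, 353)

APPROX_CRITICAL = 1.96   # large-sample approximation, two-tailed .05
for label, t in (("males", t_males), ("females", t_females)):
    verdict = "significant" if abs(t) > APPROX_CRITICAL else "not significant"
    print(f"{label}: t = {t:.2f} ({verdict} at the .05 level, approx.)")
```

Both statistics fall just short of the approximate critical value, consistent with the statement that neither correlation differed significantly from zero.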


DISCUSSION


The present study did not support the
findings of Weiner, Johnson, and Mehrabian
(1968) that the Zeigarnik effect is related
to resultant achievement motivation (i.e.,
individuals who are classified as high in
resultant achievement motivation tend to
recall a greater percentage of incorrect items
than of correct items). Moreover, just the
opposite was observed; in the high female
resultant achievement motivation group a
significantly larger proportion of subjects
recalled a majority of correct items than
recalled a majority of incorrect items. The
data do not support the contention that
the Zeigarnik effect is monotonically related
to the strength of resultant achievement
motivation (i.e., that there would be more
people in the high resultant achievement
motivation group than in the low resultant
achievement motivation group who recalled
a majority of failed items). These
inconsistencies may be due to the differing formats
of the criterion tests in the two studies; the
present study employed a multiple-choice
format while the former involved a
sentence-completion task. Thus, such factors
as item length, presence of competing
responses, and the like could come into play.
Similarly, the difficulty of a test item could
likely affect its chances of being recalled.
The mean of the present test represented
63% of the total possible score, whereas the
mean of the test used by Weiner, Johnson,
and Mehrabian represented 83% of the
total possible score.

The lack of a significant relationship be- 
tween resultant achievement motivation 
and scholastic aptitude (near-zero correla- 
tion with the American College Test) and 
between resultant achievement motivation 
and scholastic performance (no significant 
differences in initial exam scores among 
male resultant achievement motivation 
groups) was consistent with the observation 
made by Weiner, Johnson, and Mehrabian 
(1968). However, for females, the high re- 
sultant achievement motivation group did 
score higher than the other two resultant 
achievement motivation groups on the ini- 
tial test. 

The observation that the three resultant 
achievement motivation groups did not dif- 
fer on the retest, for either sex, after adjust- 
ment for initial test differences, is not sur- 
prising. Although the high female resultant 
achievement motivation group initially 
scored higher than the low resultant
achievement motivation group and also cor- 
rected a larger percentage of their initially 
failed items, they had fewer items requiring 
attention and correction. Consequently, the 
net gain for the two groups could remain 
equal even though the high level resultant 
achievement motivation group would ap- 
pear to correct a larger percentage. The 
equivalent net gain would thus sustain the 
initial differences, which would later be 
cancelled by the covariance adjustment. 
This situation was particularly evident be- 
tween the high and low females in both the 
feedback and no-feedback groups and be- 
tween the middle and low males in the no- 
feedback group. Future investigations 
might well wish to consider resultant 
achievement motivation group differences in 
error-correcting behavior after controlling 
the number of initially failed items to 
which each must attend. 
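
The arithmetic behind this argument can be illustrated with a minimal sketch. The counts below are hypothetical, chosen only to show how two groups can correct different percentages of their failed items and still gain the same number of items; they are not data from the study.

```python
# Hypothetical counts (not from the study): items initially failed and
# items corrected on the retest for two illustrative groups.
groups = {
    "high": {"failed": 10, "corrected": 6},
    "low": {"failed": 15, "corrected": 6},
}


def percent_corrected(group):
    """Percentage of initially failed items that were corrected on the retest."""
    return 100 * group["corrected"] / group["failed"]


for name, group in groups.items():
    # The net gain is simply the number of items corrected.
    print(name, f"{percent_corrected(group):.0f}% corrected,",
          "net gain =", group["corrected"], "items")
```

In this illustration the "high" group corrects a larger percentage of a smaller pool of failed items, so both groups gain the same number of items and the initial difference between them is preserved.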

All resultant achievement motivation 
levels showed a significant advantage in re- 
ceiving feedback, but not by the same 
amount. For males, the medium resultant 
achievement motivation group would ap- 
pear to benefit the least in having feedback 
in that they showed the smallest difference 
in the percentage of failed items corrected 
on the retest (.605 minus .351). The low 
resultant achievement motivation group 
and the high resultant achievement motiva- 
tion group showed the largest differences 
(.647 minus .261, and .647 minus .304, re- 
spectively). For females it would appear 
that all resultant achievement motivation 
groups would benefit equally by having the 
feedback, in that all differences in the feed- 
back/no-feedback error-correcting percent- 
ages were approximately equal. This finding 
would support a practice of returning exam-
ination papers with correct answers to
failed items noted, as a means of promoting 
retention and, possibly, subsequent learn- 
ing. 
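
The feedback advantage for the male groups can be recomputed from the proportions quoted above. The following is only a sketch; the dictionary layout and the name `benefit` are illustrative choices, while the six proportions are those given in the text.

```python
# Proportion of initially failed items corrected on the retest (males),
# as quoted in the text for the feedback and no-feedback conditions.
corrected = {
    "low": {"feedback": 0.647, "no_feedback": 0.261},
    "medium": {"feedback": 0.605, "no_feedback": 0.351},
    "high": {"feedback": 0.647, "no_feedback": 0.304},
}

# Feedback advantage = feedback proportion minus no-feedback proportion.
benefit = {
    level: round(p["feedback"] - p["no_feedback"], 3)
    for level, p in corrected.items()
}

for level, diff in benefit.items():
    print(f"{level:>6}: {diff:.3f}")
```

On these figures the medium group shows the smallest advantage and the low group the largest.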

Further research might investigate the ef- 
fect of feedback on the retest performance 
of those who do and do not exhibit the Zei- 
garnik effect (or tendency to recall incor- 
rect items) rather than by resultant 
achievement motivation groups as in the 
present study. Even though further analy- 
ses might show a significant relationship be- 
tween the Zeigarnik effect and error-cor- 
recting behavior, it would be of little use 
until we are able to identify those who will 
tend to exhibit the Zeigarnik effect. The 
present study suggests that an individual's 
motivation to achieve, as presently meas- 
ured, is not of much value in identifying 
whether or not he is able to make the 
proper recall and to engage in subsequent 
error-correcting behavior. 
ASSOCIATIVE SYMMETRY AND SECOND-
LANGUAGE LEARNING¹


LEONARD M. HOROWITZ AND ALICE M. GORDON


Stanford University 


Theoretical formulations in verbal learning have suggested a strategy
for teaching a vocabulary list. Suppose the subject has to learn a list
of English-Japanese pairs: He could be taught the translations (e.g.,
tobu-jump), and independently, the Japanese response words (e.g.,
tobu). Then, according to the theory, the criterion pairs (e.g.,
jump-tobu) should automatically emerge. Two experiments were conducted
to test this hypothesis. Together, they suggested that the subject can
learn more efficiently by this method. However, it seemed to be
critical that the two simpler tasks be intermingled. Otherwise, the
subject seemed to forget one task while working on the other, and
his overall efficiency was reduced.


This paper examines the principle of as-
sociative symmetry and its implications for
second-language learning. Two experiments 
are reported. When the principle was tested 
in its strictest form, the prediction was not 
confirmed. However, a revision of the hy- 
pothesis suggested an efficient way to teach 
English-Japanese vocabulary. 

According to the principle of associative 
symmetry, an association between two
items, A and B, is symmetrical: Under ideal
conditions, A elicits B with exactly the
same strength as B elicits A. (See Asch & 
Ebenholtz, 1962; Ekstrand, 1966; Horowitz, 
1966; Kanek & Neuner, 1970.) First, sup- 
pose the two associates are equally availa- 
ble. (An item's "availability" tells how 
readily it comes to mind, or how salient it is 
as a unit of speech.) According to the
theory, the associates then elicit each other
with equal probabilities; the forward
strength equals the backward strength.

If the two associates differ in availabil- 
ity, though, the less available tends to elicit 
the more available; the association seems to 
have a direction. For example, in paired-as-
sociate learning the subject usually antici-
pates the responses out loud, so the re-
sponses become more available than the
stimuli. Then the less available stimulus 
(A) elicits the more available response (B), 
but not vice versa. 

This principle of associative symmetry 
has been supported under very simple con- 
ditions. Horowitz, Brown, and Weissbluth 
(1964), for example, prepared a paired-as- 
sociate task with six pairs of nonsense 
words. The stimuli of some pairs also hap- 
pened to be responses of other pairs; they 
therefore became highly available. (The 
stimulus of the pair B-C, for example, grew 
highly available because B was also the re- 
sponse of the pair A-B.) Subsequent free 
associations showed symmetry in the B-C 
associations: C elicited B as strongly as B 
elicited C. 

More recently, the principle has been 
elaborated to explain some cases of latent 
learning (Horowitz et al., 1968). Suppose
the subject learns the paired associate A-B, 
with B growing more available. According 
to the extended theory, the B → A associa-
tion is latent: To make it overt, one only
needs to make A more available. If A could
be made as available as B, the backward
association B → A would emerge as
strongly as A → B.

Horowitz et al. (1968, Experiment 5) 
tested the claim this way. In Part 1 of their 
experiment, the subject learned six pairs of
nonsense words (A—B). Then in Part 2, the 
experimental procedure made each A item 
more available. The task required the sub- 
ject to produce the A items as often as he 
had produced the B items in Part 1. Then in 
Part 3 of the experiment, each original pair 
was tested for forward or backward recall: 
A-___ or ___-B. The results showed
equally good forward and backward recall 
under this condition. 

This result may find practical application 
in second-language learning. Suppose a stu-
dent learning Japanese has to learn hashiru
(to run). If vocabulary were taught by the
paired-associate method, the stimulus-re-
sponse pair could be either hashiru → run or
run → hashiru. Hashiru → run is easier
since run is so much more available. But
run → hashiru is really the goal of learning
Japanese. The above experiment suggests
one way to reach this goal: First teach the
subject hashiru → run (the easier task).
Then make hashiru more available, perhaps
through a task that emphasizes correct
pronunciation. Then the latent association
run → hashiru should emerge automati-
cally.

According to a standard analysis, a 
paired-associate task can be analyzed into 
two ingredient tasks—associative learning 
and response learning: For one thing, the 
subject has to learn an association between 
each A and each B (e.g., between run and 
hashiru). By the principle of associative 
symmetry, though, the association can be 
established through A-B learning or 
through B-A learning. Furthermore, the 
subject has to learn the responses (e.g., 
hashiru). The responses can be learned 
through any task that makes the subject 
produce the Japanese words. The symmetry 
theory suggests, then, that associative 
learning and response learning are inde- 
pendent. Either task can occur first. 

Thus, a subject could learn hashiru-run
(the translation task) and hashiru (the
Japanese word) in either order. By associa-
tive symmetry, run → hashiru should then
emerge automatically. Whenever associa-
tive learning and response learning occur,
then, two results should follow: (a) The
harder direction (run → hashiru) should
emerge without further training (unless the
pretraining tasks undergo rapid forgetting);
and (b) the total time it takes a subject to
learn the two simpler tasks should equal the
time it takes to learn the harder task di-
rectly. In fact, the simpler tasks might even
be learned faster because of motivational
advantages and reduced interference. Ex-
periment I tests these implications.


EXPERIMENT I 


Method 


The subject learned two pretraining tasks; then
he transferred to a criterion task. The criterion task
contained 12 pairs of items: jump-tobu, pound-utsu,
run-hashiru, throw-nageru, kick-keru, shrug-yusuru,
lift-motsu, walk-aruku, sit-suwaru, clap-tataku,
punch-tasuku, and scratch-kaku. In the
final criterion task, every subject learned these 
pairs until he could anticipate each item correctly 
once. 

Pretraining. The two pretraining tasks are 
called associative pretraining and response pre- 
training. In each task the dropout method was 
used. Twelve pairs appeared one by one on 3 X 6 
inch index cards. Each card was placed behind a 
cardboard screen so that the stimulus appeared 
through an aperture in the screen. The stimulus 
appeared alone for 2 seconds, and then the ex- 
perimenter moved the card to expose the stimu- 
lus and response together for 2 seconds. A 4-second 
interval separated one pair from the next. 

Whenever a subject anticipated the response 
correctly, the card was removed from the deck. 
If he missed the item, it was left in the deck.
Items which the subject missed were all shuffled, 
and the entire procedure was repeated on those 
items. Eventually the subject could anticipate 
every item once. Then the entire testing pro-
cedure was repeated until he could correctly an-
ticipate each of the 12 responses in succession.
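
One cycle of the dropout method described above can be sketched as follows. This is a minimal illustration, not the authors' procedure in full: the learner model (`make_learner`, in which an item is anticipated correctly once it has been seen a fixed number of times) and the three-item deck are hypothetical, and only the 4-second presentation time comes from the text.

```python
import random

SECONDS_PER_PRESENTATION = 4  # stated in the text


def make_learner(exposures_needed):
    """Hypothetical learner: an item is anticipated correctly only after
    it has already been seen `exposures_needed[item]` times."""
    seen = {}

    def anticipates(item):
        seen[item] = seen.get(item, 0) + 1
        return seen[item] > exposures_needed[item]

    return anticipates


def dropout_cycle(items, anticipates, rng=None):
    """Present the deck repeatedly, removing each item once it is
    anticipated correctly; return the total number of presentations."""
    rng = rng or random.Random(0)
    deck = list(items)
    presentations = 0
    while deck:
        missed = []
        for item in deck:
            presentations += 1
            if not anticipates(item):
                missed.append(item)  # missed items stay in the deck
        rng.shuffle(missed)          # missed items are shuffled and re-presented
        deck = missed
    return presentations


needed = {"tobu": 0, "utsu": 1, "hashiru": 2}  # hypothetical difficulty levels
count = dropout_cycle(needed, make_learner(needed))
print(count, "presentations,", count * SECONDS_PER_PRESENTATION, "seconds")
```

The count returned by `dropout_cycle` corresponds to the score used in this experiment, the number of item presentations needed to reach the criterion.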

In associative pretraining, the subject studied 
the 12 pairs as Japanese-English translations. He 
saw every pair once for 2 seconds each. Then he 
was tested by the dropout method until he reached 
a criterion of one perfect trial. In response pretrain- 
ing, the subject learned to produce the 12 Japa- 
nese words; a minimal cue (a fragment of the 
Japanese word) occurred as the stimulus, and the 
subject had to generate the complete Japanese 
word. The fragment (e.g., n-g-r-) ap-
peared as a stimulus, and the subject had to pro-
duce the complete word (nageru) as his response.

The subject's score told how many item pres-
entations were needed to reach the criterion of
mastery. Each item presentation lasted 4 seconds,
so the number of item presentations is proportional
to the total amount of time spent in mastering the
task.

Experimental conditions. Three experimental
conditions were compared. Group AR learned the
associative learning task, then the response learn-
ing task, and then the criterion task. Group RA
learned the response learning task, then the asso- 
ciative learning task, and then the criterion task. 
Group C learned the criterion task only. 

Subjects. Forty-five subjects were tested alto- 
gether, 15 in each condition. The subjects were all 
students in the introductory psychology class at 
Stanford University.


Results and Discussion 


Each task was scored to tell the total 
number of item presentations that a subject 
needed to master the task. The means are 
shown in Table 1. 

Overall efficiency in learning. The main 
question was whether the experimental pro- 
cedure was as efficient as the control proce- 
dure. Table 1 reports each condition's over- 
all mean, the total number of item presen- 
tations summed across all tasks. For Condi- 
tion AR, this mean was 247.47; for Condi- 
tion RA, 274.86; and for Condition C, 
167.93. These means differed significantly
(F = 6.88, df = 2/42, p < .01). Simple
contrasts showed that Condition C differed
from each of the other conditions (p <
.001), but that the two experimental condi-
tions did not differ from each other (p >
.10). Thus, Method C, the direct approach,
was the most efficient method.

Pretraining tasks. The two experimental 
groups were also compared on associative 
learning. The mean for Group AR was 
125.40; that for Group RA, 104.93. The dif- 
ference was not significant (t = 1.22, df = 
28, p > .10).

Then the groups were compared on re-
sponse learning. The mean for Group AR
was 85.40; for Group RA, 125.40. In this
case, t = 3.06, df = 28, p < .01. Group
AR’s superiority suggests either that
warm-up is particularly important for re-
sponse learning, or that prior associative
learning can facilitate response learning.

Criterion task only. When the criterion 
task is examined, the three groups are not 
strictly comparable. The criterion task was 
the control subject’s first task, but it was 
the experimental subject’s third task. Thus, 
warm-up was uncontrolled, and so were fac-
tors related to fatigue. Still, the three crite-
rion-task means were compared to help un-
derstand the effect of pretraining.
The two experimental conditions showed
enormous positive transfer on the final cri-
terion task. The mean of Condition AR was
36.67; that of Condition RA, 44.53; and 
that of Condition C, 167.93. These means 
differed significantly (F = 42.07, df = 2/ 
42, p < .001). Expressed as a savings score 
relative to Condition C, the percentage of 
savings for Condition AR was 78.2%; for 
Condition RA, 73.5%. 
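
The overall totals and the savings scores can be checked from the task means reported above. The sketch below assumes that the savings score is the percentage reduction in criterion-task presentations relative to Condition C; the variable names are illustrative.

```python
# Mean item presentations per task, as reported in the text.
means = {
    "AR": {"associative": 125.40, "response": 85.40, "criterion": 36.67},
    "RA": {"response": 125.40, "associative": 104.93, "criterion": 44.53},
    "C": {"criterion": 167.93},
}

# Overall mean = sum of the means of all tasks a condition worked on.
totals = {cond: round(sum(tasks.values()), 2) for cond, tasks in means.items()}

# Savings on the criterion task relative to the control condition C.
baseline = means["C"]["criterion"]
savings = {
    cond: round(100 * (baseline - means[cond]["criterion"]) / baseline, 1)
    for cond in ("AR", "RA")
}

print(totals)
print(savings)
```

The computed totals agree with the overall means reported for the three conditions, and the savings percentages agree with those quoted above.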

The hypothesis, however, implies more 
than positive transfer on the criterion task; 
it implies perfect transfer. If the compo- 
nents of paired-associate learning are really 
independent, and if associative symmetry is 
a valid principle, then the subject in either 
experimental group should, after pretrain- 
ing, know the criterion task items perfectly. 
But the subject needed more study time to 
master the criterion task. 

Two explanations are possible. For one 
thing, the pretraining tasks may not con- 
tain all the important elements of the crite- 
rion task. Perhaps the subject does not 
learn everything he needs to know from the 
pretraining tasks. 

Furthermore, the subject may have 
partly forgotten the first pretraining task 
while he was working on the second. A sub- 
ject in Condition RA, for example, may 
have forgotten some of his once-mastered 
Japanese words while he was learning the 
translations. According to this explanation, 
the general hypothesis might still be valid; 
but to demonstrate its validity, we would 
have to minimize the delay between each 
pretraining task and the criterion task. For
example, the pretraining tasks might be in-
terspersed: one trial of response learning,
then one trial of associative learning, then
one trial of response learning, and so on.
Both kinds of pretraining would then occur
just before the criterion task.


TABLE 1
MEAN NUMBER OF ITEM PRESENTATIONS TO
CRITERION ON EACH TASK

Task                      Group AR    Group RA    Group C
Pretraining
  Associative learning     125.40      104.93        -
  Response learning         85.40      125.40        -
Criterion task              36.67       44.53      167.93
All tasks combined         247.47      274.86      167.93

Experiment II was designed to examine 
this second possibility. The pretraining 
tasks were intermingled in order to reduce 
the delay between each pretraining task 
and the criterion task. 


EXPERIMENT II


The experimental design was changed in 
several ways. For one thing, Experiment I 
had required that each pretraining task be 
mastered. A strict criterion may force a 
subject to spend time mastering unessential 
details, so the procedure as a whole may 
seem less efficient than it really is. (Ek- 
strand, 1966, and Kanek & Neuner, 1970 
have made similar suggestions.) In Experi- 
ment II, therefore, the subject spent less 
time on pretraining. 

Second, the criterion task was made to be 
comparable in all experimental conditions. 
Experiment II assigned a pretraining activ- 
ity to all subjects, so they were all equally 
practiced when they began the criterion 
task. 

Finally, the two pretraining tasks were 
intermingled. A trial of associative learning 
alternated with a trial of response learning. 
In this way, the delay was reduced between 
each pretraining task and the criterion task. 


Method 


Three groups of 15 subjects were tested:
an experimental group and two control
groups. They were all students in the class
in general psychology at Stanford Univer- 
sity. 

The procedure for every group contained 
two basic components: a pretraining task
and a criterion task. The three groups dif-
fered in pretraining only; the criterion task
was the same for all subjects. The experi-
ment’s design is summarized in Table 2.
The design is illustrated with the represent- 
ative pair tobu-jump. Details of the design 
are explained below. 

Criterion task. The criterion task con- 
tained English-Japanese word pairs which 
were shown by the paired-associate method. 
The pairs were: jump-tobu, pound-utsu,
run-hashiru, throw-nageru, kick-keru,
shrug-yusuru, lift-motsu, walk-aruku, and
sit-suwaru. The pairs were presented on a La-
fayette memory drum at a 1:1-second rate.
An English word appeared for 1 second and 
the subject tried to anticipate the Japanese 
word. Then the Japanese word joined the 
English word for 1 second. A 1-second in- 
terval separated one pair from the next. 
The subject worked on this task for 12 
trials or to mastery, whichever came first. 
Four different orders of presentation were 
adopted; the order varied systematically
from trial to trial. 

Pretraining task. Every subject’s pre- 
training contained two kinds of activity 
which alternated from trial to trial. Pre- 
training as a whole lasted seven trials: Four 
trials of associative pretraining alternated 
with three trials of response pretraining. 
First the subject spent one trial on associa-
tive pretraining, then one trial on response
pretraining, then another trial on associa-
tive pretraining, and so on, to make a total
of seven trials. Let us denote these trials T1
through T7. Associative pretraining oc-
curred on T1, T3, T5, and T7; response pre-
training on T2, T4, and T6.


TABLE 2
EXPERIMENTAL DESIGN OF EXPERIMENT II

                                                          Condition
Task                                    Experimental    Control group     Control group
                                        group (E)       Chard             Ceasy
Pretraining
  Associative pretraining               tobu-jump       jump-tobu         tobu-jump
    (T1, T3, T5, T7)                                                      (like Group E)
  Response pretraining                  tobu            tobu              jump
    (T2, T4, T6)                                        (like Group E)
Criterion task (12 trials or mastery)   jump-tobu       jump-tobu         jump-tobu
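
The alternation of the two kinds of pretraining over the seven trials can be written out as a small sketch. The function name `pretraining_schedule` is illustrative; the rule it encodes (odd-numbered trials associative, even-numbered trials response) is the one stated in the text.

```python
def pretraining_schedule(n_trials=7):
    """Map each pretraining trial (T1..Tn) to its kind of activity:
    odd-numbered trials are associative, even-numbered trials are response."""
    return {
        f"T{i}": "associative" if i % 2 == 1 else "response"
        for i in range(1, n_trials + 1)
    }


schedule = pretraining_schedule()
for trial, kind in schedule.items():
    print(trial, kind)
```

With seven trials this yields four associative trials and three response trials, so that both kinds of pretraining end shortly before the criterion task begins.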

Associative pretraining. Each subject of 
the experimental group was shown the Jap- 
anese-English translations. The pairs were 
identical to those of the criterion task, but
the direction was reversed: Japanese-English
pairs rather than English-Japanese pairs.
The pairs were presented
by the paired-associate method. As each
Japanese word appeared, the subject tried
to anticipate its English translation. The
procedural details were identical to those of
the criterion task.

Two control groups were also tested.
These groups are called Chard and Ceasy.
Group Ceasy was treated exactly like the
experimental group in associative pretrain-
ing. Group Chard had a harder associative
task: their associative pretraining was ex-
actly like the criterion task. That is, the
pairs appeared in the English-Japanese 
direction during pretraining too. Thus, this 
group may seem to have had a three-trial 
head start; as the results will show, though, 
their subsequent performance was impaired. 

Response pretraining. During response
pretraining the experimental subject
learned to produce the Japanese words.
Three different tasks were used so as not to
bore the subject and to prevent irrelevant
learning. On Trial T2 the subject read the
nine Japanese words from index cards; then
he saw each word tachistoscopically and
tried to identify the Japanese word when it
appeared. The word was flashed twice and 
each time it lasted 60 milliseconds. There 
were no errors on this task. 

On T4 the subject saw anagrams of the
Japanese words. Each anagram appeared on
a 3 × 5 inch index card, and the subject
tried to unscramble the letters to decipher
the Japanese word. He was allowed 10 sec- 
onds for each word; then he was shown the 
correct response. 

Finally, on T6 the subject saw a fragment
of each word and he tried to identify the
whole word. He received a sheet which con-
tained the first letter of each word and was
allowed 90 seconds to record the completed
words. Then, with his responses still before
him, he saw a list of two-letter cues (the
first two letters of each word). He added any
further responses if he could. Finally, he
saw a list of three-letter cues, and again tried to
add further responses. At the end of T6 the
subject was shown a complete list of all the
Japanese words. The subject’s performance
was scored to provide an estimate of his
level of response learning.

Group Chard worked on this same set of
tasks. For these subjects, T2, T4, and T6
were identical to the experimental group's
response pretraining. The other control
group, Ceasy, did not have special practice
on the Japanese words. During Trials T2,
T4, and T6, they practiced the English
words. That is, they read the English words,
identified them tachistoscopically, deci-
phered anagrams, and identified the words
from fragments. Thus, Group Ceasy had the
advantage of distributed practice with op-
portunity to rehearse associations to them-
selves. But they had no explicit practice
producing the Japanese words.


Results and Discussion 


Pretraining. Associative pretraining for
Group Ceasy was the same as for Group E.
These two groups can therefore be com-
pared to establish their comparability.
Group E's mean numbers of correct antici-
pations on Trials T3, T5, and T7 were .80,
1.27, and 1.67, respectively. For Group
Ceasy the corresponding means were .80,
1.80, and 2.13. An analysis of variance
showed a significant effect of trials (F =
5.47, df = 2/56, p < .01). The two condi-
tions did not differ significantly, though (F
= 2.45, df = 1/28), nor did conditions inter-
act significantly with trials (F = .37, df =
2/56). Most important, the two means were
identical initially (Trial T3), so the groups
do seem to be comparable.

(If associative pretraining had been continued
for more trials, Group Ceasy might eventu- 
ally have gained a significant advantage. If 
so, the advantage would probably be due to 
its easier response pretraining task. Utter- 
ing Japanese words may slightly interfere 
with the translation task. But after only 
three trials of associative pretraining, the 
groups did not differ significantly in any
way.) 

Group Chard performed more poorly in
associative pretraining because of its harder
task. The mean numbers of correct responses
on Trials T3, T5, and T7 were .27, .73, and
1.18, respectively. This group's pretraining 
task was identical to the criterion task so 
these means will be helpful when each 
group's criterion performance is examined. 

(b) Response pretraining—Only one 
task of response pretraining was scored— 
the task of T6, which required the subject to
produce a word when a fragment served as 
the cue. The subject’s performance on this 
task was scored in two ways. One measure, 
the "stringent" score, told how often the 
subject responded correctly to a one-letter 
cue. The “lenient” score gave partial credit 
when he subsequently responded to a two- 
letter or three-letter cue. The lenient score 
was calculated this way: 1 point was scored 
for each correct response to a one-letter cue, 
½ point for each additional response to a
two-letter cue, and ¼ point for each addi-
tional response to a three-letter cue.
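
The two scoring rules can be sketched as follows, assuming weights of 1, 1/2, and 1/4 point for responses first given to one-, two-, and three-letter cues. The function names and the example counts are hypothetical.

```python
def stringent_score(one_letter):
    """Number of words produced correctly from the one-letter cue alone."""
    return one_letter


def lenient_score(one_letter, two_letter, three_letter):
    """Full credit for one-letter cues, partial credit for words first
    produced from a two-letter (1/2) or three-letter (1/4) cue."""
    return 1.0 * one_letter + 0.5 * two_letter + 0.25 * three_letter


# Hypothetical subject: 5 words from one-letter cues, 2 more from
# two-letter cues, and 1 more from a three-letter cue.
print(stringent_score(5))      # stringent score
print(lenient_score(5, 2, 1))  # lenient score
```

By construction the lenient score can never fall below the stringent score, and the two coincide for a subject who produces every word from its first letter.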

Response pretraining for Group Chard
was the same as for Group E. These two
groups can therefore be compared to estab-
lish their comparability. By the stringent
method, the mean for Group E was 5.40 and
for Group Chard, 6.27. By the lenient
method, the two means were 6.03 and 6.85.
Neither pair of means differed significantly:
for the former, t = 1.22, df = 28; for the
latter, t = 1.28, df = 28. Thus, the two
groups are comparable in their perform-
ances. (If the slight superiority of Group
Chard has any significance at all, it can be
explained this way: In associative pretrain- 
ing, that group had to produce Japanese 
words as responses; it may therefore have 
had a slight advantage. From a practical 
standpoint, it is interesting that three trials 
of effort on the difficult paired-associate 
task did not yield more response learning.) 

Response pretraining was very easy for 
Group Ceasy. This group only had to gener- 
ate the nine English words. For those sub- 
jects the mean score by the stringent 
method was 8.60 correct responses; by the 
lenient method, 8.83. Twelve subjects of the 
15 earned a perfect score. 

Criterion task. Now let us compare the
three groups on the criterion task. Each
group's performance is shown in Figure 1.
For Group Chard, Trial 1 of the criterion
task is really a continuation of associative
pretraining; therefore, its means on Trials
T3, T5, and T7 of pretraining are included
for comparison. Figure 1 shows Group E's
clear superiority. This group surpassed the
other two groups throughout the 12 crite-
rion trials. Furthermore, Group E's per-
formance on Trial 1 surpassed the perform-
ance of Group Chard on any of its first 5
trials (T3, T5, T7, and Trials 1 and 2 of the
criterion task).

Fig. 1. Mean number of correct anticipations on each trial of criterion task for each group.

An analysis of variance was performed 
on each group's number of correct re- 
sponses. According to this analysis, the con- 
ditions differed significantly (F = 5.75, df 
= 2/42, p < .01). The effect of trials was
also significant (F = 29.02, df = 11/462, p
< .001). But the interaction between trials
and conditions was not significant (F =
1.51, df = 22/462, p > .05).

The mean number of correct responses for
Group E was 3.56; for Group Chard, 2.84;
and for Group Ceasy, 1.91. A t test was used to
compare Group E with Group Chard. An-
other t test compared the two control
groups. In both cases, the difference was 
significant at the .05 level. For the former 
comparison, t = 2.05, df = 28; for the lat- 
ter comparison, t = 2.24, df = 28. 

Correlations between tasks. For each task
(response pretraining, associative pre-
training, and the criterion task) every item
was scored for its success. For example,
consider a subject in Group E or Group
Ceasy learning tobu-jump in associative
pretraining. The score told how often sub-
jects anticipated jump correctly on Trials
T3, T5, and T7. The corresponding score for
Group Chard told how often subjects antic-
ipated tobu correctly.

Two scores were computed for each item 
to describe the subjects’ performance on the 
criterion task. One score told how often the 
item was correctly anticipated on Trial 1 of 
the criterion task. The other score told how 
often the item was correctly anticipated 
throughout all 12 trials. In general, the 12- 
trial score correlated more highly with 
other measures, so it may be the more relia-
ble measure.

The score in response pretraining told
how often subjects produced tobu when the
one-letter fragment appeared (t---) for
Group E and Group Chard.

Scores on the various tasks were intercor-
related, and the resulting rs appear in Table
3. These rs were computed separately for
each condition. Response pretraining scores
for Group Ceasy described the recall of the
9 English words. Most of these scores were
perfect, so rs were not computed for this
measure.


TABLE 3
INTERCORRELATIONS BETWEEN TASKS

Task                                   Group E   Group Chard   Group Ceasy
Associative pretraining vs.
  Response pretraining                   .10        .42            -
Response pretraining vs.
  Criterion task, Trial 1                .07        .48            -
Response pretraining vs.
  Criterion task, Total score            .04        .65            -
Associative pretraining vs.
  Criterion task, Trial 1                .52        .86*          .75**
Associative pretraining vs.
  Criterion task, Total score            .81*       .89*          .91*

 *p < .01.
**p < .001.
For all other rs, p > .05.

First consider the correlations for Group 
E. Performance in response pretraining was 
not significantly correlated with perform- 
ance in associative pretraining. Thus, the 
pair keru-kick, for example, was the easiest 
to translate, while the response keru was of 
medium difficulty. Apparently, items that 
are easy to translate are not necessarily
easy to produce. Response pretraining also 
failed to correlate with criterion perform- 
ance. 

However, associative pretraining did cor- 
relate significantly with the criterion task. 
Items that were learned fast in associative 
pretraining were learned fast in the crite- 
rion task. Thus, the skills contained in the 
criterion task seem to resemble those of as-
sociative pretraining more than those of re-
sponse pretraining. However, response pre- 
training was very important as the original 
data of Figure 1 showed. Perhaps those in- 
gredients of response pretraining that are 
important for the criterion task were not 
reflected in the measure of response pre- 
training that was adopted for these inter-
correlations.

The Ceasy group learned the translation 
task, but not the Japanese words. For them, 
the correlations between associative pre- 
training and the criterion task were also 
significant—actually, higher than the corre- 
sponding rs for Group E. Apparently, the 
presence of Group E’s response pretraining 
lowered the value of its rs. For example,
nageru-throw was Group Ceasy's hardest
pair in pretraining, and throw-nageru was 
its hardest pair in the criterion task. For 
Group E, though, nageru-throw was the 
hardest pair in associative pretraining, but 
after their practice producing nageru, the 
pair throw-nageru became easier; it rose to 
medium difficulty. Possibly, the experimen- 
tal group's advantage is that the response 
pretraining makes the previously hardest 
pairs no longer quite as hard. 

Finally, consider the Chard group. In
both associative pretraining and response
pretraining, these subjects had to produce
the Japanese words. These tasks were some-
what correlated, but the r did not reach
statistical significance. For this group, asso- 
ciative pretraining was the very same task 
as the criterion task, and so associative pre- 
training really amounted to early trials on 
the criterion task. Therefore, a correlation 
between these tasks is really a correlation 
between early performance and later per- 
formance on the very same task. This corre- 
lation (associative pretraining versus the 
criterion task) was definitely high, but not 
much higher than the corresponding corre- 
lation for Group E. This fact again high- 
lights the general importance of associative 
pretraining to the criterion task. 


CONCLUSION


The two experiments, then, can be summarized
this way. According to the principle
of associative symmetry, an English-Japanese
vocabulary list can be taught efficiently
in an indirect way: The subject
can learn the associations in the easier
direction and independently learn the Japanese
response words. Then, he should automatically
know the pairs in the harder
direction. Experiment I did not confirm this 
hypothesis. It seems to have failed mainly 
because the two simpler tasks were sepa- 
rated in time; a subject seemed to have for- 
gotten one task while learning the other. 

In Experiment II, though, the two sim- 
pler tasks were intermingled. Under this con- 
dition, a subject did learn very efficiently 
—in fact, faster than comparable control 
subjects. Therefore, it seems that subjects 
can learn English-Japanese pairs efficiently 
by an indirect method. The experimental 
arrangement thus seemed to enhance learn- 
ing; perhaps it reduced the amount of inter- 
nal interference, and perhaps it made the 
tasks more interesting. Further research is 
still needed to determine the exact reasons 
for the facilitation and to discover the opti- 
mum arrangement for pretraining. 
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EXPLORATION OF THE EFFECT OF DENSITY AND 
SPECIFICITY OF INSTRUCTIONAL OBJECTIVES 
ON LEARNING FROM TEXT 


E. Z. ROTHKOPF¹ AND
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Intentional and incidental learning was studied as a function of (a) 
the density in the text of sentences relevant to instructional objec- 
tives and (b) the specificity with which instructional objectives were 
described. The major findings were: (a) more intentional learning re- 
sulted from specific rather than broad objectives, but incidental 
learning was not affected by this factor; and (b) increases in density 
of instructional objectives resulted in decreases in the likelihood 
that any intentional item was learned, but did not affect performance 
on incidental items. Intentional learning was generally greater than 
incidental. Performance on both intentional and incidental items 
was considerably higher when instructional goals were explicitly described
than when directions similar to those commonly employed
in learning experiments were used. 


Providing explicitly stated objectives to 
students prior to instruction has been shown 
to increase the effectiveness of training 
(Mager & McCann, 1961). Supplying stu- 
dents with objectives in this way is analogous 
to the use of directions in Type II incidental 
learning studies (see Postman, 1964, p. 187). 
In these studies, intentional learning is de- 
fined in terms of the materials that are 
relevant to directions that have been given
to a subject prior to training. 

The present experiment was an attempt 
to explore the use of instructional objectives 
as directions that describe the relevant instructional
content in written discourse to
subjects. The factors that were of primary
interest were the density of relevant information
in the text and the specificity with
which the objectives were described in the
directions.

Density was defined as the proportion of 
sentences in a text that were relevant to at


¹ Requests for reprints should be sent to Ernst
Z. Rothkopf, Learning and Instruction Research
Department, Bell Laboratories, 600 Mountain
Avenue, Murray Hill, New Jersey 07974.

² We are indebted to M. J. Billington for her
valuable help in analyzing the data.


least one instructional objective. This factor 
is of both practical and theoretical interest. 
High-density text resembles the condition 
encountered in text that has been tightly 
edited from an instructional point of view, in 
the sense that most instructionally irrelevant 
information has been removed. Since editing 
and rewriting are time consuming and re- 
quire skill, the efficiency of high-density 
configurations is an important practical 
question. 

A somewhat more theoretical issue is 
raised by the possibility that high-density 
mapping of objectives on text may limit 
the likelihood that any single objective is 
achieved. Some evidence in support of this 
conjecture has been obtained by Poulton
(1958), in an experiment in which inspection 
time was controlled by the experimenter. 
The first purpose of the present experiment 
was to determine whether this finding holds 
for situations in which subjects control their 
own inspection time and whether incidental 
learning is also affected by density. 

A second purpose of this study was to 
explore the role of specificity of objectives 
(directions) in determining both intentional 
and incidental learning. In the sense of the 
present experiment, generally stated objectives
describe learning goals by use of
class names for the objects or properties that
the subject is directed to learn about. Each
generally stated objective is relevant to
several sentences in the text. Specific objectives,
on the other hand, explicitly name
each object or property to be learned. Each 
specific objective is relevant to only one text 
sentence, while a broad objective is relevant 
to several. Specific objectives or directions 
are greater in number than the equivalent 
general directions and are therefore more 
cumbersome and less desirable from a prac- 
tical point of view. 


METHOD 
Experimental Scheme 


The two major experimental factors were (a) 
specificity in the description of instructional ob- 
jectives, that is, in the phrasing of directions to 
learn; and (b) the density of relevant sentences in 
the text. Relevant sentences were those that were 
empirically determined to be relevant to one of the 
objectives. Density was the ratio of relevant
sentences to the total number of sentences in the
text. The effects of these two factors on intentional
and incidental learning were explored in a factorial
experiment. 


Materials 

Three experimental passages were selected from 
textbooks developed by the System Training 
Department at Bell Telephone Laboratories.³
The passages were respectively 842, 1,091, and 
1,120 words in length and composed of 60, 56, and 
55 sentences each. They contained instructional 
material on printing designs for forms, on business 
systems, and on system development. The materials
had relatively loose sequential organization.


Objectives

A set of specific and general objectives was 
prepared for each passage. Each objective con- 
sisted of a single sentence or phrase describing a 
learning goal. A specific objective was phrased so 
as to be relevant to exactly one sentence in the 
passage. A general objective was relevant to 2-5 
sentences. 

The set of specific objectives for each passage 
consisted of several subsets, each comprising 
from 2 to 5 topically related objectives. Each 
topically related subset of specific objectives was 


³ We are grateful to Mr. F. L. Stevenson, Head,
Systems Training Department, for kindly allowing
us to use these materials.
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relevant to adjacent sentences in the passage. A 
topically related subset of specific objectives was 
equivalent to one general objective in that they 
were both relevant to the same group of adjacent 
sentences in the passages.⁴ Relevance was determined
by empirical procedures that will be described
later. The total number of specific statements
of objectives prepared for each of the three
passages was 33, 30, and 37, respectively. The total 
number of corresponding general objectives avail- 
able for the three passages was 12, 9, and 9, respec- 
tively. 

A series of preliminary studies was conducted 
in order to assure that the identical set of passage 
sentences was judged relevant to matching spe- 
cific and general objectives. This was done by 
requesting the subjects, in a preexperimental
study, to read a given objective, to underline all 
sentences in the passage that were relevant to it, 
and to label the sentences with the appropriate 
objective number. 

The first tryout involved two groups of 10 sub- 
jects each. Both groups read all three passages. 
One group of subjects read two passages using 
specific objectives and one passage using general 
objectives; the second group had the reverse 
assignment of directions to passages. In this way, 
a mapping of the general and the specific objectives 
was obtained on the sentences of each passage 
(n = 10 for each objective-passage combination). 
A sentence was considered relevant if 9 out of 10 
subjects assigned it to the same objective. When 
disagreement occurred, the objective and/or the 
passage was rewritten. All objectives were readministered
on each subsequent tryout. Six
tryouts with different groups of 20 subjects were 
required to attain agreement. 
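
The relevance criterion described above can be illustrated with a short Python sketch. It is an illustration only, not the procedure actually used to score the tryouts: the function name, the data layout (one objective number per judge, or None when a judge left the sentence unmarked), and the example judgments are all hypothetical. Only the threshold of 9 agreeing judges out of 10 comes from the text.

```python
from collections import Counter


def relevant_objective(assignments, threshold=9):
    """Return the objective a sentence is mapped to, or None.

    `assignments` holds one entry per judge: the number of the objective
    that judge attached to the sentence, or None if it was left unmarked.
    The sentence counts as relevant only if at least `threshold` judges
    assigned it to the same objective.
    """
    votes = Counter(a for a in assignments if a is not None)
    if not votes:
        return None
    objective, count = votes.most_common(1)[0]
    return objective if count >= threshold else None


# Hypothetical judgments from 10 judges for two sentences.
agreed = [3, 3, 3, 3, 3, 3, 3, 3, 3, None]    # 9 of 10 chose objective 3
disputed = [3, 3, 3, 3, 3, 3, 3, 4, 4, None]  # only 7 of 10 agree

print(relevant_objective(agreed))
print(relevant_objective(disputed))
```

A sentence such as the second one would have sent the objective or the passage back for rewriting before the next tryout.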


Density 

The subsets of topically related, specific objectives
were divided into three groups for each
passage. The three groups were chosen so that the 
sentences relevant to each group were approxi- 
mately equal in number and were fairly evenly 
distributed throughout the experimental passage. 
The three groups of objectives for each passage 
were labeled A, B, and C. The A, B, and C groups 
were relevant to 10, 12, and 11 sentences in the
first passage; to 10, 10, and 10 sentences in the
second passage; and to 14, 12, and 11 sentences in
the third passage. The number of sentences assigned
to the three groups of objectives (A, B, C), as well
as to the universally incidental condition, is
summarized in Table 1.

Three density levels of relevant sentences in 


⁴ The following is an example of a general objective:
Learn about the physical appearance of the
three kinds of type faces discussed! The matched set
of specific objectives was: (1) Learn about the
physical appearance of Gothic type! (2) Learn about
the physical appearance of Italic type! (3) Learn
about the physical appearance of Roman type!
TABLE 1
ASSIGNMENT OF SENTENCES TO OBJECTIVES GROUPS FOR THE THREE
EXPERIMENTAL PASSAGES

                                      Passage
Objectives group                  1      2      3

Universally intentional (A)      10     10     14
Mixed (B)                        12     10     12
Mixed (C)                        11     10     11
Universally incidental (D)       23     22     14
Unassigned                        4      4      4
Total                            60     56     55


each passage were produced. They will be referred 
to as the 20%, 40%, and 60% density levels, al- 
though these percentages are not exact. The 
method for producing the three experimental 
levels of densities in the three passages is sum- 
marized in Table 2. Density 20% was achieved by 
using Objective Group A, either in the general or 
specific form. Density 40% involved Objective 
Groups A and B. Objective Groups A, B, and C
were used to achieve the 60% density level. The 
three density levels involved 10, 22, and 33 relevant 
sentences out of 60 sentences for the first passage; 
10, 20, and 30 out of 56 sentences for the second 
passage; and 14, 26, and 37 relevant sentences out 
of 55 in the last passage. In this way, three density 
levels, exactly matched for relevant sentences in 
the general and specific objectives treatment, were 
produced for each passage. It should be noted that 
the term density was used more for convenience 
than descriptive accuracy. Three potentially 


TABLE 2
METHOD FOR ACHIEVING THE THREE DENSITIES OF RELEVANT SENTENCES
IN THE THREE EXPERIMENTAL PASSAGES

                                  Number of relevant
                                  sentences in passage
Density*   Objectives group       1      2      3

20         A                     10     10     14
40         A + B                 22     20     26
60         A + B + C             33     30     37

Total sentences per passage      60     56     55

* Stated densities are approximate percentages.
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important factors were confounded in this var- 
iable and the experiment was not designed to 
distinguish among them. They were: (a) the num- 
ber of objectives presented to the subject; (b) the 
number of relevant sentences in the text; and (c) 
the ratio of relevant sentences to the total number 
of sentences in the text. 
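
Because the stated densities are only nominal, it may help to see the actual ratios. The following Python sketch recomputes them from the sentence counts given in the text and in Tables 1 and 2. The counts are taken from the article; the variable and function names are hypothetical and the sketch is offered as an illustration rather than as part of the original method.

```python
# Relevant-sentence counts per objectives group (A, B, C) and total
# sentences per passage, as reported in the text and in Table 1.
group_sizes = {
    1: {"A": 10, "B": 12, "C": 11},
    2: {"A": 10, "B": 10, "C": 10},
    3: {"A": 14, "B": 12, "C": 11},
}
total_sentences = {1: 60, 2: 56, 3: 55}

# Objectives groups used at each nominal density level (Table 2).
levels = {"20%": ["A"], "40%": ["A", "B"], "60%": ["A", "B", "C"]}


def density_table():
    """Map (passage, nominal level) to (relevant sentences, actual ratio)."""
    table = {}
    for passage, sizes in group_sizes.items():
        for label, groups in levels.items():
            relevant = sum(sizes[g] for g in groups)
            table[(passage, label)] = (relevant, relevant / total_sentences[passage])
    return table


for (passage, label), (relevant, ratio) in density_table().items():
    print(f"Passage {passage}, nominal {label}: "
          f"{relevant} relevant sentences, actual density {ratio:.1%}")
```

The output makes plain that the three passages depart from the nominal levels by different amounts, which is why the labels are treated as approximate.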


Tests 


Short-answer test questions were constructed 
for each sentence relevant to an objective by 
removing one key substantive word from each 
sentence and substituting for it a line of uniform 
length. In addition, similarly constructed test 
questions were developed for most of the re- 
maining sentences which were not relevant to any 
objective (universally incidental). The number of 
universally incidental sentences for which test 
questions were constructed was 23, 22, and 14 
sentences, respectively, per passage. There was a 
total of 56, 52, and 51 test items per passage. The 
test questions, used for each passage, were pre- 
sented in three different random orders in the 
experiment. 
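
The item-construction rule just described (delete one key substantive word and substitute a line of uniform length) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the blank width, and the example sentence are hypothetical and do not come from the experimental materials.

```python
import re

# A line of uniform length, so the blank gives no cue to the word's length.
BLANK = "__________"


def make_test_item(sentence, key_word):
    """Return (question, answer) for a short-answer item.

    The first whole-word occurrence of `key_word` in `sentence` is
    replaced by a blank of fixed width.
    """
    pattern = re.compile(r"\b" + re.escape(key_word) + r"\b")
    question, n_replaced = pattern.subn(BLANK, sentence, count=1)
    if n_replaced == 0:
        raise ValueError(f"{key_word!r} does not occur in the sentence")
    return question, key_word


# Hypothetical sentence, invented for illustration only.
question, answer = make_test_item("Gothic type has no serifs.", "serifs")
print(question)
print(answer)
```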


Design 

A 3 X 2 X 3 X 2 factorial design was used. The
factors were: (a) three passages; (b) two levels of 
objectives (specific, general); (c) three levels of 
density (20%, 40%, and 60%); and (d) two kinds 
of learned performance (intentional versus inci- 
dental), with repeated measures on the same 
subjects for the last factor. Twenty-one subjects 
were assigned to each of the 18 treatment com- 
binations. 

One additional reference treatment (n = 21) 
was used for each passage. The reference groups 
studied the experimental text with the very broad 
directions to learn “everything” in the text. This 
treatment corresponds to the direction usually 
employed in learning experiments and will be 
referred to as the CLD (Conventional Learning 
Directions) treatment. 
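
The cell counts implied by this design can be checked with a few lines of arithmetic. The Python sketch below is an added illustration, not part of the original analysis, and its variable names are hypothetical. It assumes equal cells of 21 subjects, as stated above. Under that assumption it reproduces the total of 441 subjects reported for the main experiment and an error term with 360 degrees of freedom, which agrees with the df reported for the analyses of variance.

```python
# Between-subjects factors of the main design.
passages = 3
specificity_levels = 2   # specific, general
density_levels = 3       # 20%, 40%, 60%
subjects_per_cell = 21

between_cells = passages * specificity_levels * density_levels  # treatment combinations
main_subjects = between_cells * subjects_per_cell               # main factorial design

# One CLD reference group per passage.
reference_subjects = passages * subjects_per_cell

total_subjects = main_subjects + reference_subjects

# Subjects-within-groups error term for the factorial analysis.
error_df = main_subjects - between_cells

print(between_cells, main_subjects, reference_subjects, total_subjects, error_df)
```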


Procedure 
The subjects received three manila envelopes 
containing, respectively: (1) a set of objectives 
(directions) and a passage; (2) the short-answer 
test on the assigned passage; and (3) some reading 
material to occupy the subject while other sub- 
jects completed their experimental task. Each 
envelope also contained written secondary di- 
rections about the use of the enclosed materials. 
The subjects in the main experimental treatments 
were instructed that they would be tested only on
the information in the passage relevant to the
stated objectives. However, they were tested on
almost every sentence in the passage, thereby
permitting testing of intentional (relevant to an
objective) and incidental (nonrelevant) learning. 
The subjects were permitted to view the directions 
describing the objectives while reading the pas- 
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sage. The objectives were arranged on the di- 
rection page(s) in the same sequence as the cor- 
responding relevant sentences in the text. Care 
was taken to assure that all subjects worked 
through the three envelopes in the proper order. 
Study and test time was controlled by the subject. 


Subjects 

Paid volunteers (N = 120) served as subjects in
the preexperimental study to determine the num- 
ber of sentences relevant to each direction. They 
were undergraduate students at Rutgers Univer- 
sity, New Brunswick, New Jersey. 

Paid volunteers (N = 441) from three New
Jersey high schools (Scotch Plains-Fanwood, New
Providence, and Summit) were used in the main
experiment.⁵ They consisted of 206 males and
275 females ranging in age from 14 to 19 years. 
The experiment was conducted at each high school 
shortly after the last school period. 


RESULTS 


The test items for each passage were 
divided into four groups according to their 
relevance to the three density levels for 
learning objectives (see Table 1). Item Cate- 
gory A was derived from sentences that 
were relevant under all three density levels; 
Category B was derived from sentences 
relevant to only the 40% and 60% density 
levels; Category C came from sentences that 
were relevant in the 60% density condition 
only, and items in Category D came from 
sentences that were not relevant under any 
of the three densities of objectives. Items in 
Category A will be referred to as universally 
intentional and Category D as universally 
incidental. This was done for brevity and to 
contrast these items with those in Categories 
B and C which were intentional in some 
conditions and incidental in others.
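
The classification of items can be restated compactly. The Python sketch below encodes the four item categories as described in this paragraph; the names are hypothetical and the sketch is an illustration, not the scoring procedure used in the experiment.

```python
# Density levels (in nominal percent) at which each item category was
# relevant to a stated objective, as described in the text.
INTENTIONAL_AT = {
    "A": {20, 40, 60},  # universally intentional
    "B": {40, 60},      # mixed
    "C": {60},          # mixed
    "D": set(),         # universally incidental
}


def kind_of_learning(category, density):
    """Return 'intentional' or 'incidental' for an item category at a density."""
    if density not in (20, 40, 60):
        raise ValueError("density must be 20, 40, or 60")
    return "intentional" if density in INTENTIONAL_AT[category] else "incidental"


for category in "ABCD":
    print(category, [kind_of_learning(category, d) for d in (20, 40, 60)])
```

Only Categories A and D keep the same status at every density, which is why the analyses that follow are restricted to them.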


Universally Intentional and Incidental Items


A summary of this analysis for universally 
intentional (Category A) and for universally 
incidental items (Category D) is shown in 
Figure 1. For the purpose of clarity, the data 
in the graph have been averaged across the 
three passages. A 3 X 3 X 2 X 2 factorial 
analysis of variance was performed on these 


⁵ We wish to thank the principals of these three
high schools for their generous help in obtaining 
subjects and in providing experimental space. 
They are P. H. Tyson of Scotch Plains-Fanwood; 
W. M. McCarthy, of New Providence; and D. R. 
Geddis of Summit. 
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data. The factors were: passage (3), density 
(3), specificity of objectives (2), and kind of 
learning (2, i.e., intentional versus incidental 
items) with repeated measures on the last 
factor. Arc sine transformations were applied 
to proportion of correct responses. This 
analysis supports the following conclusions: 

1. Intentional learning was greater than 
incidental learning (F = 368.62, df = 
1/360, p < .001). 

2. Specific objectives resulted in higher 
performance on intentional items than gen- 
eral objectives. Specificity of objectives had 
little or no effect on incidental learning. The 
main effect due to specificity of objectives 
(F = 12.461, df = 1/360, p < .001) and the
interaction between specificity and kind of
learning (F = 38.486, df = 1/360, p <
.001) were significant.

3. Increases in density were accompanied 
by decreases in the proportion of intentional 
items that were correctly recalled. There 
were no measurable effects of density on 
incidental learning. Main effects due to 
density were not significant (F = .851, df = 
2/360), but the interaction between density 
and kind of learning was (F = 9.422, df = 
2/360, p < .001). The triple interaction, 
Density X Objectives X Kind of Learning, 
was not significant (F = 2.514, df = 2/360,
.05 < p < .10). Comparison of experimental
densities under specific objectives, using the
Newman-Keuls method, indicated the following:
Density 60% resulted in significantly
lower performance on universally intentional
items than Density 20% (q = 6.182, df =
3/360, p < .01). Differences between Densities
20% and 40% (q = 1.944, df = 2/360;
t = 1.37, df = 360) and between Densities
40% and 60% (q = 1.948, df = 2/360) were
not significant. None of the comparisons 
among various densities of instructional ob- 
jectives for the general direction treatments 
was significant, nor were any of the com- 
parisons among various densities for per- 
formance on universally incidental items. 
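
The arc sine transformation mentioned above is a standard variance-stabilizing step for proportions. The article does not say which form was used, so the Python sketch below assumes the common version, 2 arcsin(sqrt(p)); both the formula choice and the function name are assumptions made for illustration.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform for a proportion p in [0, 1].

    Assumes the common form 2 * asin(sqrt(p)); the original analysis may
    have used a different scaling.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:.2f} -> {arcsine_transform(p):.4f}")
```

The transform stretches the scale near 0 and 1, where the variance of a raw proportion shrinks, so that treatment groups with very different mean proportions can be compared in one analysis of variance.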


Comparison with the Conventional Learning 
Directions Reference Group 

As was expected, performance on uni- 
versally intentional items (Category A) was 
substantially higher in the experimental than 
the CLD reference group. This is clear from 
an inspection of Figure 1. More noteworthy
was the finding that the experimental treatments
also resulted in higher performance on
universally incidental items (Category D)
than the CLD reference condition. The latter
finding was tested for individual passages
by t comparisons, with the data combined
for the general and specific directions in the
main experimental treatments. For Passage
1, t = 1.79, df = 379, .05 < p < .10; for
Passage 2, t = 5.22, df = 379, p < .001; and
for Passage 3, t = 3.03, df = 379, p < .01.


FIG. 1. Average proportion correct responses for any intentional and incidental learning
test item, as function of the density of instructional objectives in text, for two methods
of describing the objectives and a conventional learning directions reference group (CLD).
(The intentional and incidental data are based on the test items of Groups A and D,
respectively [see text].)


Total Test Performance


The previous analysis indicated that
high density of instructional objectives in
text resulted in smaller likelihood that any
given intentional item was correctly an- 
swered. This does not mean, however, that 
the total test performance for high-density 
treatments will be less than for lower densi- 
ties. This is because high-density treatments 
have a higher number of objective-relevant 
items in text than the low-density treat- 
ments. Hence, a larger proportion of the 
total number of objectives were inspected 
under the intentional condition in the higher 
density treatments. Total test performance 
is shown for the various treatments in 
Figure 2. The plotted data are the mean 
proportions of all items on the test that 
were correctly answered. The data have 
been combined over all three passages. The 
results indicated that overall performance 
FIG. 2. Proportion of all test items resulting in a correct response, as function of the
density of instructional objectives in text, for specifically and generally described objectives.


increases with the number of instructional 
objectives that were provided for subjects. 
Specific objectives resulted in higher per- 
formance than general objectives. These 
conclusions were supported by a 3 X 2 X 3 
analysis of variance. The factors were pas- 
sage, specificity, and density. Arc sine trans- 
formations were used on the proportions of 
correct responses. All three factors were 
significant: for passages (F = 38.22, df = 
2/360, p < .001), for directions (F = 5.91, 
df = 1/360, p < .05), and for density (F = 
7.88, df = 2/360, p < .01). As can be seen 
in Figure 2, specific objectives resulted in 
higher performance than general objectives. 
Using the Newman-Keuls technique, Den- 
sity 60% resulted in significantly higher 
performance than Density 40% (q = 2.80,
df = 2/360, p < .05), which in turn pro- 
duced significantly higher total test scores 
than Density 20% (q = 2.817, df = 2/360, 
p < .05). 


DISCUSSION


The most salient finding of the experiment 
was the large effects on learning produced by 


providing instructional objectives to the sub- 
ject before exposure to the text. This result 
has been frequently reported in Type II
laboratory studies of incidental learning. 
The substantial practical implications of 
these findings have been recognized by Deese 
(1964, p. 206) and Mager and McCann 
(1961). The present observations serve to 
draw further attention to the usefulness for 
the schools of the simple technical practice 
of providing explicit instructional goals for 
students.

The main substantive findings of this 
experiment were that (a) density increments 
resulted in reduction of the likelihood that 
any single intentional item was learned but 
did not affect incidental learning; and (b) 
specifically stated objectives produced more 
intentional learning than more generally 
stated objectives. In this connection, the
fact that our experimental setting did not
allow us to collect data on inspection time is
unfortunate because such observations would 
have allowed some insight into the relative 
efficiency of the various experimental treat- 
ments. From a practical point of view this 
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omission is not serious because, in most 
instructional situations, inspection time is 
under the control of the students. The 
instructional value of any treatments has to 
be considered in terms of overall effects on 
learned performance regardless of details of 
the interesting relationships among treatment,
inspection time, and learned performance.

Poulton (1958) has reported results that 
are consistent with the present findings that 
the high density of objectives decreased the 
likelihood that any given objective was 
mastered. He found, in a paced reading 
situation, that a density of 18% resulted in 
markedly better intentional learning than a 
density of 100%. The largest difference be- 
tween the two densities was found when rate 
of presentation was slow (37 words/minute). 
This condition appears closest to the self- 
paced conditions of the present experiment 
in which subjects controlled their own study 
pace. 

The variable referred to as density has 
been called so for convenience only and 
without prejudging its key characteristics. 
At least three factors were confounded in 
density in the present experiment. These 
were: (a) the number of objectives presented 
to a subject, (b) the number of relevant 
sentences in the text, and (c) the ratio of 
relevant sentences to total number of sen- 
tences in the text. The present experiment 
was not designed to distinguish among these 
factors. Distinction among the effective fac- 
tors behind the density variable is closely 
related to an interesting, theoretical ques- 
tion: Were the effects of increasing density 
on intentional performance due to difficulties 
in information acquisition or were they due 
to characteristics of memory processes? One 
information acquisition hypothesis is that 
some factor associated with increased density,
such as the longer list of objectives,
makes it less likely that a relevant sentence
in the text will be found and appropriately
decoded. Another information acquisition
hypothesis is that increases in density result
in less discriminating selection from the text,
that is, greater attention to inappropriate
text elements. Memory hypotheses, on the
other hand, attribute the density effect on
intentional learning to increased interference
that results from the greater number of
items in memory or to increased difficulty in 
retrieval from memory. 

The experiment provided no definitive 
basis for choosing between the two alternative
explanations. However, a secondary finding
was not consistent with a simple memory 
hypothesis. This was the failure to find any 
density effects for performance on incidental 
items. If memory limitations had been re- 
sponsible for the density effect, incidental 
memory should have been affected in the 
same way as intentional memory. 

The observation that specifically stated 
objectives produced higher performance
poses several interesting questions. Was the 
result due to the nature of the list of objec- 
tives? Or was it a consequence of better 
ability to quickly reject the irrelevant sen- 
tences in the passage? The possibility cannot 
be ruled out that generally phrased direc- 
tions have both advantages and disadvan- 
tages compared to the more specific objec- 
tives, but that the disadvantages had 
stronger effects in this experiment. The ad- 
vantage may be due to the more compact 
form of the general objectives and the disad- 
vantage due to greater difficulty in recogniz- 
ing relevant sentences. The relative magni- 
tude of these effects under various conditions 
is unknown. The possibility exists that the 
relative effectiveness of broadly stated objectives
increases as the number of instructional
objectives grows larger.

The performance on incidental learning 
items requires some comments. Incidental 
learning was not influenced by specificity of 
objectives. This cannot be interpreted to 
mean that incidental learning is never 
affected by the nature of directions such as 
instructional objectives. First, such a con- 
clusion was contradicted by the finding that 
explicitly stated objectives produced higher 
performance on incidental items than the 
vague general directions that are usually 
used in learning experiments (i.e., the CLD 
reference group). Second, previous experiments
such as those of Frase (1969) have
shown that search directions have a pro- 
found influence on incidental learning. These 
effects of directions and other similar inter- 
ventions on incidental learning have been 
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called positive mathemagenic effects by 
Rothkopf (1970). They are hypothesized to 
be due to more effective inspection and 
processing of the text. 

Finally, the finding that density and
specificity of directions did not affect inci- 
dental learning is reassuring. It suggests that 
carefully specified instructional objectives 
will not interfere with the serendipitous 
discovery of information not directly rele- 
vant to instruction. 
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RETROACTIVE INHIBITION OF PROSE AS A FUNCTION OF 
THE TYPE OF TEST
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Retroactive inhibition of textbook-style prose was demonstrated using 
a short-answer test and, especially, using a multiple-choice test which
included the specific competing response from the interpolated pas- 
sage. No retroactive inhibition was detected on multiple-choice tests 
in which the distractors entailed only responses from the original pas- 
sage or noncompeting responses from the interpolated passage. The 
results were in close agreement with interference theory and the findings
of paired-associate research.


The determinants of retroactive inhibition
are clearly specified in the interference
model of forgetting. Although interference 
theory may not explain all of the phenom- 
ena of forgetting, it has stood the tests of 
time and empiricism, and forgetting in the 
psychological laboratory regularly occurs as 
interference theory predicts (cf. Adams, 
1967). Yet most experiments which have at- 
tempted to demonstrate retroactive inhibi- 
tion with prose have failed. One thesis of 
the present study is that these experiments 
failed to find retroactive inhibition with 
prose because they did not closely approxi- 
mate the specifications of the interference 
model of forgetting. 

Among the necessary conditions for retroactive
inhibition are: (a) corresponding
stimuli in the original learning and interpolated
learning must be similar while the associated
responses vary; and (b) the retention
measure must differentiate items on the
basis of the similarity of stimuli and responses.
When the similarity between original
learning and interpolated learning has
been taken into account, and when test
items have paralleled this comparison, retroactive
inhibition has usually been demonstrated
with prose materials. When these
factors have not been considered, retroac- 
tive inhibition has usually not been de- 
tected. 

In the latter case, for example, Ausubel, 
Stager, and Gaite (1968) and Ausubel, 
Robbins, and Blake (1957) used prose ma- 
terials but failed to differentiate informa- 
tion according to the types specified in in- 
terference theory. It is likely that these ex- 
periments failed to detect retroactive inhi- 
bition because the passages and tests con- 
tained about equal numbers of response- 
different and response-same items, which, in 
the final tally, would balance each other, 
leaving a net effect of no retroactive inhibi- 
tion. Wong (1970) attempted to specify 
similar stimuli across original learning and 
interpolated learning via paragraph head- 
ings while changing the content of the para- 
graphs from original learning to interpo- 
lated learning. The failure to detect retro- 
active inhibition in this experiment could 
have been due to failure to discriminate be- 
tween response-same and response-different 
items on the retention test. 

Anderson and Myrow (1971) reported 
two studies wherein the stimulus and re- 
sponse terms were specified according to the 
interference model. The stimuli were similar.
Where the responses were the same, facilitation
occurred; where different, retroactive
inhibition occurred. Contrary to expectation
there was more retroactive inhibition
when a recognition measure (multiple- 
choice test) was used than when a recall 
measure (short-answer test) was used. Ex- 
amination of the errors made by the group 
receiving similar interpolated learning re- 
vealed that those subjects chose the inter- 
polated-learning responses much more often 
than their counterparts who received dis- 
similar interpolated learning. These results 
were seen as evidence for response competi- 
tion, that is, confusion among responses 
which the subject had available and which, 
under some conditions, he was capable of 
giving to the stimulus. 

Research has seemed to indicate that re- 
sponse unavailability is the chief factor in 
retroactive inhibition with paired asso- 
ciates. For instance, Postman and Stark 
(1969) found marked retroactive inhibition 
in the A-B, A-C paradigm on a recall test, 
whereas they obtained no retroactive inhi- 
bition on a multiple-choice test. The dis- 
crepancy was attributed to response una- 
vailability, which could not affect perform- 
ance on a recognition measure since the re- 
sponses are furnished. However, Postman 
and Stark did not include the specific competing
interpolated-learning responses
among the alternatives in the recognition 
test. Such a procedure effectively ruled out 
response competition, as did the associative 
matching procedure employed in several 
other studies because either only original- 
learning responses were provided (McGovern,
1964) or the list membership of the
responses was indicated (Postman, Stark, & 
Fraser, 1968). 

Anderson and Watts (1971) investigated 
retroactive inhibition with paired associates 
using one of three types of unpaced multi- 
ple-choice test. When each test item in- 
cluded the specific competing interpolated- 
learning response there was a substantial 
amount of retroactive inhibition. No retro- 
active inhibition appeared when test items 
entailed only original-learning responses or 
noncompeting interpolated-learning re- 
sponses. An analysis of errors revealed that 
the competing interpolated-learning re- 
sponse was selected from two to four times 
as often as other distractors. Postman 
(1952) has reported similar results.

The general purpose of the present study
was to reaffirm the findings of Anderson and 
Myrow (1971) that retroactive inhibition of 
prose can follow interference theory predic- 
tions. The specific purpose was to determine 
whether various types of multiple-choice 
tests are differentially sensitive to retroac- 
tive inhibition of prose in a way parallel to 
the differential sensitivity of such tests to 
retroactive inhibition of paired associates.


Method


Materials 


The original learning and related interpolated 
learning passages were 2,240-word and 2,190-word 
passages respectively, each describing a fictitious 
African tribe. These were the passages employed 
by Anderson and Myrow (1971) in Experiment I. 
About one-third of the information in each passage 
was analogous to the response-same items de- 
scribed above. For example, the religious practices 
of both tribes allowed only the members of a
deceased tribesman's clan to prepare the tribes- 
man's body for cremation. Another third of the 
information in each passage was unrelated to the 
other passage; for example, the Himoots spent a 
great deal of time as foresters while the Gruandas' 
main activity was hunting. The remaining informa- 
tion in each passage was analogous to the response- 
different mode described in the introduction. For 
example, both tribes had a complex clan system, 
but the Himoots' system was based on occupation, 
while the Gruandas’ was based on the various 
stars. The information varied in kind from specific 
(names of food) to general (religious practices). In 
most cases where proper nouns were required, 
paralogs of about equal association value (Noble, 
1952) were assigned to the two passages. 

The unrelated interpolated learning passage 
(control) was the one on Zen Buddhism used by 
Ausubel et al. (1968).

The Farr, Jenkins, and Paterson revision of
the Flesch Reading Ease Formula (Klare, 1963) 
was used to compare readability of the passages. 
The uncorrected reading levels were 6.9 for the 
original learning passage, 6.6 for the related inter- 
polated learning passage, and 6.7 for the unrelated 
interpolated learning passage. 

Three multiple-choice tests and one short- 
answer test were developed to assess the role of 
response competition in retroactive inhibition. 
Ten items assessed retention of response-same in- 
formation in the passages, 10 assessed retention of 
neutral information, and 14 items assessed retention
of response-different material.

The three multiple-choice tests differed only in 
the response alternatives for the response-different 
items; the stems were identical. For the specific- 
both group, the alternatives for the response- 
different items consisted of the correct original
learning response, the corresponding specific interpolated
learning response, and one nonspecific
distractor each from original learning and inter- 
polated learning. Alternatives for the nonspecific- 
both group were the same as for the specific-both 
group except that an additional nonspecific re- 
sponse from interpolated learning was substituted 
for the specific competing response on each re- 
sponse-different item. The alternatives for the 
response-different items for the original learning- 
only group consisted of the correct answer and 
three distractors from the original passage. Dis- 
tractors for response-same items and neutral items 
were identical for all multiple-choice tests. Alternatives
for these items included the correct answer
and distractors in roughly equal distribution from
both original learning and related interpolated
learning. Scores for the multiple-choice tests were
corrected for guessing by the formula R - W/3.
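
The correction for guessing can be illustrated with a short sketch. It assumes that R is the number of items answered correctly, that W is the number answered incorrectly (omitted items counted in neither), and that W is divided by 3 because each item offers four alternatives. The function and parameter names are hypothetical and are not part of the original report.

```python
def corrected_score(right: int, wrong: int, n_alternatives: int = 4) -> float:
    """Correction for guessing: R - W / (k - 1) for k-alternative items.

    With four alternatives per item this reduces to R - W/3.
    """
    if n_alternatives < 2:
        raise ValueError("need at least two alternatives per item")
    return right - wrong / (n_alternatives - 1)


# Hypothetical subject: 20 right, 9 wrong, 5 omitted on the 34-item test.
print(corrected_score(20, 9))  # 17.0
```

Under these assumptions a subject who guesses blindly on every item is expected to earn a corrected score of zero, since one guess in four is right and three are wrong.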

The short-answer test presented the item stems 
followed by blank spaces. These tests were scored 
on the basis of a predetermined key which speci- 
fied the limits of acceptable answers. 

An 11-item questionnaire asked students about
their study strategies, while reading original learn- 
ing and interpolated learning, and about their 
interest in the materials. 

To measure verbal ability, the Educational 
Testing Service Wide-Range Vocabulary Test 
(French, Ekstrom, & Price, 1963) was administered.


Design 


There were three between-subject variables and
one within-subject variable in the experiment. The
between-subject variables were type of interpolated
learning (related or unrelated); test type (specific
both, nonspecific both, original learning only,
or short answer); and verbal ability (three levels).
These factors were completely crossed. The within-subject
variable was item type: response same,
neutral, or response different.
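
The crossing of the between-subject factors can be sketched as follows. This is only an illustration of the design as described; the labels for the three verbal-ability levels are assumed, since the report does not name them.

```python
from itertools import product

# Between-subject factors as described in the Design section.
interpolated_learning = ["related", "unrelated"]
test_type = [
    "specific both",
    "nonspecific both",
    "original learning only",
    "short answer",
]
verbal_ability = ["low", "medium", "high"]  # level labels are assumed

# Completely crossed: every combination of the three factors is one cell.
cells = list(product(interpolated_learning, test_type, verbal_ability))

# Within-subject factor: every subject answers all three item types.
item_types = ["response same", "neutral", "response different"]

print(len(cells))  # 24
```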


Procedure 


On Day 1, the subjects first took the verbal
ability test and then read the original learning
passage. Answers to the vocabulary test were
recorded on Digitek answer sheets. Subjects were
instructed to read the original learning material
carefully in preparation for a test that would cover
both the general concepts and specific details of
the passage. When they finished studying the
passage, the students wrote down the time on
the Digitek sheets, which they then brought to the
front of the classroom. The experimenter or an
assistant checked to make sure that the times were
recorded accurately. When a large clock was not
clearly visible to all students, the experimenter or
an assistant wrote the time on the chalkboard at
intervals. The subjects were encouraged
to study the materials carefully. Consequently, the
time recording was down-played by informing the
students that “We're not interested in any individual’s
speed, but in how much you learn.”

On Day 2, the Digitek answer sheets were
returned to students. Each person also received 
either the related or the unrelated interpolated 
learning passage. Assignment to the interpolated 
learning group was random with approximately 
equal distribution within each section. Subjects 
coded the answer sheets according to interpolated 
learning type, read instructions identical to those 
for original learning, and proceeded to study the 
interpolated learning passages, record the time, 
and bring forward the passages, exactly as on 
Day 1. 

After reading the interpolated learning passages, 
the subjects filled out the questionnaire, again 
recording answers on the Digitek blank. A few 
students were surprised that the “test” did not 
more explicitly cover the learning materials. The 
experimenter attempted to ignore the queries 
while leaving the impression that the question- 
naire items were “the questions” about the 
passages, a ruse designed to discourage rehearsal
or discussion of the learning materials before the
forthcoming retention test. However, it was learned
later that several of the teachers had inadvertently 
told the students that the experimenter would 
“return for more testing next week.” 

On Day 8, the Digitek sheets were returned to 
the students inside the tests. The four types of 
tests were distributed in approximately equal 
proportions throughout each section, and assigned 
to subjects randomly. After the tests were administered
and all materials collected, the experimenter
explained the experiment.

The subjects were allowed up to about 25 
minutes to read each passage and to take the 
appropriate test. They were allowed 12 minutes 
to take the vocabulary test. 


Subjects 

One hundred and seventy-four students in 
American history sections of a suburban Chicago 
high school served as subjects. Most of the students
were sophomores. The experiment was conducted
during regular school hours.


Results


Table 1 displays the mean percentages 
correct on the three types of test items for
the four different test groups and catego- 
rized by the type of interpolated learning. 


TABLE 1
UNWEIGHTED MEAN PERCENTAGE CORRECT ON TESTS OF ORIGINAL LEARNING

Interpolated                                 Test type
learning     Item type            MC-SB     MC-NSB    MC-OLO    SA

Related                           (n = 20)  (n = 26)  (n = 20)  (n = 20)
             Response same        41.0      46.2      47.9      32.1
             Neutral              41.3      43.1      52.8      30.4
             Response different   33.6      43.4      54.4      20.3
Unrelated                         (n = 21)  (n = 23)  (n = 23)  (n = 21)
             Response same        .2        37.5      35.1      26.8
             Neutral              47.6      44.7      55.2      38.7
             Response different   53.1      44.7      46.6      26.5

Note. Abbreviations: MC-SB = multiple choice specific both; MC-NSB = multiple choice
nonspecific both; MC-OLO = multiple choice original learning only; SA = short answer.


Effects of Interpolated Learning and Item Type


It was expected that if retention were
tested with either a short-answer test or
with a multiple-choice test which included
competing interpolated-learning responses
among the alternatives for response-different
items, then retroactive inhibition would
occur as predicted by the interference
theory of forgetting. To test this prediction,
an unweighted means analysis of variance 
was computed for the combined specific- 
both and short-answer test groups. There 
was no overall effect of interpolated learning.
The predicted Type of Interpolated
Learning × Item Type interaction (F =
4.22, df = 2/140, p < .02) is pictured in
Figure 1. The retroactive inhibition on re- 
sponse-different items fits the interference 
model. The expected facilitation for the re- 
lated-interpolated-learning group on re- 
sponse-same items was not clearly demon- 
strated, although the means were in the pre- 
dicted direction. A decrement on neutral 
items for the related-interpolated-learning 
group is predicted by the interference model 
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Fig. 1. Mean percentage correct of combined 
multiple-choice specific-both and  short-answer 
groups as à function of type of interpolated learn- 
ing. 


(cf. Adams, 1967, p. 81). In the present 
case, there was slight, but nonsignificant, 
nonspecific interference on these items. 


Effects of Test Type 


Several planned comparisons were made 
to check the sensitivity of the multiple-
choice tests to retroactive inhibition on the 
response-different items. It was expected 
that the interference due to the related in- 
terpolated-learning passage would be mani- 
fest only when the test included the specific 
competing response from the interpolated- 
learning passage. Accordingly, performance 
on the specific-both test for the related-in- 
terpolated-learning and unrelated-interpo- 
lated-learning groups was compared and 
found different (t = 2.70, df = 39, p < .01). 
Next, comparisons were made between the
mean percentages correct on response-dif- 
ferent items for the three types of multi- 
ple-choice test taken by the subjects who 
received the related interpolated-learning 
passage. Performance was significantly 
lower on the multiple-choice specific-both
than on the multiple-choice original learn- 
ing only test (t = 2.88, df = 44, p < .01); 
and lower but not significantly so than on 
the multiple-choice nonspecific-both test (t 
= 1.36, df = 44). Newman-Keuls tests revealed
no further significant (α = .05) differences
among groups.


Analysis of Errors 


TABLE 2
MEAN NUMBER OF CHOICES PER DISTRACTOR ON
RESPONSE-DIFFERENT ITEMS

                                          Response choice
Type of inter-                 Specific  Nonspecific  Nonspecific
polated learning   Test type   IL        IL           OL           [illegible]

Related            MC-SB       3.55      1.55         1.08         .45
Related            MC-NSB      --        1.78         2.23         .75
Related            MC-OLO      --        --           1.58         .60
Unrelated          MC-SB       1.88      1.24         2.14         .62
Unrelated          MC-NSB      --        1.75         2.50         .50
Unrelated          MC-OLO      --        --           1.94         .17

Note. Abbreviations: MC-SB = specific
both; MC-NSB = nonspecific both; MC-OLO =
original learning only; IL = interpolated learning;
OL = original learning.


Table 2 categorizes the types of errors
made on response-different items. If response
competition were to play an important
role in retroactive inhibition, it would
be expected that the errors of subjects who 
received related interpolated learning would 
be characterized by frequent selection of 
the specific competing responses from inter- 
polated learning. The subjects who had re- 
ceived related interpolated learning chose 
the specific competing interpolated-learning 
responses about two and one-half times 
more frequently than subjects who had re- 
ceived unrelated interpolated learning. Fur- 
thermore, the specific competing interpo- 
lated-learning responses were chosen more 
than twice as often as either noncompeting 
interpolated-learning responses or original- 
learning responses. These results replicate 
the findings of Anderson and Myrow (1971) 
and Anderson and Watts (1971). 


Subject Study Methods 


As in the Anderson and Myrow (1971)
study, the subjects generally said they studied
the original-learning passage more carefully
than the interpolated-learning passage.
The time data corroborated the questionnaire
responses. For all groups, fewer
words were covered per minute during original
learning than during interpolated learning.


Effect of Verbal Ability


Verbal ability was significantly related to 
performance on the criterion tests (F = 
11.91, df = 2/150, p < .01). Verbal ability 
did not interact with any of the other fac- 
tors under investigation. 


Discussion 


Retroactive inhibition was shown in this 
experiment to be a viable phenomenon in 
meaningful prose memory when care was 
taken to closely approximate the conditions 
of the interference model of forgetting. To 
demonstrate retroactive inhibition in prose 
learning it was necessary to specify the 
points of similarity and difference between 
original learning and interpolated learning, 
to differentiate test questions along these 
points of contrast, and to include poten- 
tially interfering interpolated-learning re- 
sponses in the range of alternatives for re- 
sponse-different items. Given these controls, 
retroactive inhibition followed the predic- 
tions of the interference model. When re- 
sponses to the same stimuli were identical 
for original learning and interpolated learn- 
ing, retroactive facilitation occurred; when 
stimuli and responses were unique to each 
passage, there was no effect; and when re- 
sponses to the same stimuli were different 
between original learning and interpolated 
learning, retroactive inhibition was gener- 
ated. 

Further evidence was accumulated for 
the position that much of the effect of retro- 
active inhibition is due to response competi- 
tion. Postman and Stark (1969) have deem- 
phasized specific associative interference, 
contending instead that retroactive inhibi- 
tion is due in the main to general suppres- 
sion of original-learning responses. Had this 
been true in the present study, achievement 
measured by the multiple-choice nonspe- 
cific-both and multiple-choice specific-both 
tests should have been identical. It was not. 
More retroactive inhibition was detected 
when the specific competing responses were 
included among the alternatives. 

Furthermore, for the group receiving re- 
lated interpolated learning, analysis of the 
errors on the response-different items 
showed that the specific competing interpolated-learning
passage distractors were chosen
more than twice as often as either the
noncompeting interpolated-learning or orig- 
inal-learning distractors. The evidence sug- 
gests that competition between specific 
original-learning and interpolated-learning 
responses accounts for a great deal of the 
forgetting ascribable to retroactive inhibi- 
tion. 

Observers of the classroom may wonder 
how frequently in the “real world” forgetting
analogous to retroactive inhibition actually
occurs. How often does it happen
that the preconditions to retroactive inhibi- 
tion—similar stimuli paired with different 
responses—coincidentally appear in ordi- 
nary classroom activity? We seldom teach 
students different answers to the same ques- 
tion. If retroactive inhibition is generated 
in prose only when the materials are so 
closely similar, we must question the 
efficacy of the interference model as an in- 
clusive explanation of forgetting in the 
classroom. The atomistic approach that was 
required to make good the analogy between 
paired-associate retroactive inhibition and 
prose retroactive inhibition seems at once 
necessary and potentially misleading. It
seems probable that students do forget 
without the presence of closely similar con- 
fusing material. Although proponents of in- 
terference theory believe that nonspecific 
interference accounts for a good deal of for- 
getting, there was little or no evidence for 
nonspecific interference in the study re- 
ported in this paper. The present experi- 
ment did reaffirm that retroactive inhibition 
can occur with realistic prose materials. But 
it is yet to be shown whether interference
theory explains all of the phenomena of forgetting
or only some of them.


EFFECTS OF ENCODING CUES ON PROSE LEARNING¹


Two experiments investigated the influence of encoding cues on prose
learning. The encoding-cue conditions were designed to produce encoding
of information required by the output test. In Experiment I
the conditions of encoding did not influence learning, and this result
held over two study intervals and three stages of practice. In Experiment II
underlining encoding cues greatly facilitated learning, and the
facilitation was larger for fast- and medium-learning subjects than
slow-learning subjects. The results supported the conclusion that encoding
cues will facilitate prose learning when they result in encoded
information required by output which would not otherwise be encoded.


The nature of the present educational
system demands that students learn from
prose materials. Indeed, a case could be
made that a student’s ability to learn from
prose materials will largely dictate his success
or failure in many activities far beyond
those of the educational system. During
prose learning, the reader may be thought
of as encoding or processing the text. These
activities during input may result in memories
for words, phrases, images, topic sentences,
central themes, temporal information,
and so on. As yet there is no clear way
to conceptualize these memories. It seems
reasonable, however, that they are quite
complex and may even include, for example,
paraphrases of the text, or further examples
of concepts presented by the text. This
memory information becomes the reader's
record of the events of input (cf. Underwood,
1969).

¹ This research was supported in part by United
States Public Health Service Grant MH-[number illegible]
from the National Institute of Mental Health.
Requests for reprints should be sent to James
D. Crouse, College of Education, University of

Many study-skill recommendations such
as making an outline, writing a précis, note
taking, and underlining may be potentially
among the most useful ways to facilitate
prose learning. However, research has
rather consistently shown only a very small
or nonexistent increase of output when
these procedures are followed (cf. Arnold, 
1942; Idstein & Jenkins, 1972; Stordahl & 
Christensen, 1956), and the source of this 
failure is the central concern of the present 
experiments. These study-skill procedures 
may be thought of as influencing the infor- 
mation that is encoded. Within the present 
perspective, they should facilitate output 
when they result in encoded information re- 
quired by output which would not otherwise 
be encoded. This condition probably was 
not met in the studies just mentioned, since 
the readers governed their own outlining, 
précis writing, etc., and there was no appar- 
ent relationship between these activities 
and the requirements of output. In the pres- 
ent experiments subjects were given various 
encoding cues which were systematically re- 
lated to the requirements of output. The 
output test always comprised questions 
whose answers consisted of one or more 
words taken directly from the passage read.


EXPERIMENT I

Experiment I consisted of four input con- 
ditions. The first condition was a control 
(Condition C) in which a passage was sim- 
ply studied during input. The remaining 
conditions were designed to influence encod- 
ing by presentation of encoding cues. In the 
learn answers condition (Condition LA), 
input was like Condition C except that the 
parts of the passage which answered the
output questions were underlined. The subjects
were told to learn these underlined
parts because each one was an answer to a 
question that would be asked at output. 
The intent of this treatment was to maxi- 
mize encoding of the target information re- 
quired for correct output. In the generate 
questions condition (Condition GQ), input 
was like Condition LA except that subjects 
were also told that they should try to think 
of the output question each underlined part 
answered, and to try to remember the un- 
derlined part as the answer to that ques- 
tion. The purpose of this treatment was to 
influence storage of the information con- 
tained by the output questions. Finally, in 
the read questions condition (Condition 
RQ), input was like Condition GQ except 
that each of the output questions was ac- 
tually presented following the underlined 
part in the prose passage. The subjects were 
told that they should try to remember each 
underlined part as the answer to the ques- 
tion which followed in the text. The intent 
of this condition was to maximize storage of 
the information contained by the output 
questions. It was expected that correct out- 
put would increase over Condition C, LA, 
GQ, and RQ, respectively. It also seemed 
that the effects of these input conditions 
might well depend on certain other varia- 
bles, and two of these were studied in this 
experiment. The first was the study time 
given for a passage, and the second was the 
number of prose passages that were learned. 


Method 


Passages. Three fairly short passages which
were highly factual in content were used: (a) a
212-word passage about a hypothetical person
named John Payton; (b) a word passage about
a hypothetical island named Karisoon; (c) a 214-word
passage about a hypothetical library named
King Library. A set of 22 questions was generated
for each passage. Each question could be answered
unambiguously by one or more words from the appropriate
passage. There was little to no similarity
among the three passages as defined by each set
of questions being different and having different
answers.

Design. The four input conditions (C, LA, GQ, 
and RQ) were combined factorially with two study 
intervals (2.5 and 5.0 minutes) to form eight conditions.
Eighteen subjects, undergraduate students
at the University of Delaware, were assigned
to each of these conditions. Each subject learned
the three passages described above. The order in
which the three passages were learned within
each condition was counterbalanced by a Latin
square arrangement; that is, six subjects learned
the passages in the Payton, Karisoon, King 
Library order, six subjects learned the passages in 
the Karisoon, King Library, Payton order, and 
six subjects learned them in the King Library, 
Payton, Karisoon order. 
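
The counterbalancing can be sketched as below. The cyclic construction is an assumption made for illustration: the report lists only the three resulting orders, which a cyclic rotation happens to reproduce. All names in the code are hypothetical.

```python
def cyclic_latin_square(items):
    """Return a Latin square in which row r is `items` rotated left by r."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


passages = ["Payton", "Karisoon", "King Library"]
orders = cyclic_latin_square(passages)

# Eighteen subjects per condition, split evenly over the three orders.
subjects_per_condition = 18
subjects_per_order = subjects_per_condition // len(orders)

for order in orders:
    print(subjects_per_order, "subjects:", ", ".join(order))
```

Each passage appears once in each serial position, so passage differences are balanced across the three stages of practice within every condition.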

Procedure. The subjects were run in small 
groups of one to three which were assigned to 
a randomized list of conditions. The materials 
were combined into booklets and the learning 
procedure was as follows. During input, the 
passage was presented on a single page either for
2.5 or for 5.0 minutes. In Condition C, subjects
were told to learn as much as they could in the 
time provided. In Condition LA, each of the an- 
swers to the 22 output questions was underlined 
and subjects were told to learn those underlined 
parts as each would be an answer to a question they 
would be asked. In Condition GQ, subjects were 
also told to try to think of the output question 
each underlined part answered and remember the 
underlined part as the answer to that question. Finally,
in Condition RQ, the output question that
each underlined part answered was presented at the 
end of the sentence in which the underlined part 
occurred. The subjects were told that they should 
try to remember the underlined parts as answers 
to these questions since these questions would 
be asked at output. Immediately following input, 
the 22 output questions were presented on a
single page and each subject printed his answers. 
The order of the questions on the page was 
scrambled; that is, they did not follow the se- 
quence of the passage. Three minutes were given 
for output. Following learning of the first passage,
the second passage and then the third passage were
learned in succession. About 2 minutes intervened
between output of one passage and input of the 
next one, and this time was filled with a brief 
review of the learning instructions for the con- 
dition. 


Results 


The number of correct answers given by 
each subject on each passage was computed 
and these data were subjected to an analy- 
sis of variance. The mean number of correct 
answers in the C, LA, GQ, and RQ input 
conditions was 15.84, 17.09, 16.81, and 
16.85, respectively. While performance was
numerically higher in Conditions LA, GQ,
and RQ than in Condition C, the F
= 1.70, df = 3/120, p > .05, was not significant.

It was mentioned earlier that the effects
of the input conditions might depend on the
study interval used or the number of passages
learned. The pattern of the data for
the input conditions at each study interval 
is shown in Table 1. It may be seen that 
Conditions LA and GQ are higher than the 
control with the 2.5-minute study interval, 
but Conditions LA, GQ, and RQ are higher 
than the control with the 5.0-minute study 
interval. These differences, however, are 
very small and the interaction just barely 
reaches the .05 level of statistical signifi- 
cance (F = 3.05, df = 3/120). The most 
impressive finding in Table 1 is the large 
effect of study interval (F = 53.32, df = 
1/120, p < .01).

The pattern of results for the input con- 
ditions at each stage of practice is shown in 
Table 2. This pattern shows somewhat 
greater improvement with successive pas- 
sages in Conditions LA, GQ, and RQ than 
the control, but these differences are also 
small and the interaction is not significant 
statistically (F = 2.05, df = 6/240, p > 
.05). The most apparent finding in Table 2
is the effect of number of passages learned 
which shows primarily an increase in per- 
formance over the first two passages (F = 
5.17, df = 2/240, p < .01). 

TABLE 1
MEAN NUMBER OF CORRECT ANSWERS FOR EACH
INPUT CONDITION AT EACH STUDY INTERVAL

                            Study interval
Condition              2.5 minutes   5.0 minutes

Control                   14.35         17.83
Learn answers             16.11         18.07
Generate questions        15.70         17.92
Read questions            14.24         19.46


TABLE 2
MEAN NUMBER OF CORRECT ANSWERS FOR EACH
INPUT CONDITION AT EACH
STAGE OF PRACTICE

                            Passage number
Condition                1        2        3

Control                14.58    17.22    15.72
Learn answers          15.50    17.94    17.88
Generate questions     15.50    17.08    17.86
Read questions         15.33    17.47    17.75


The results shown in Tables 1 and 2 indicate
that there was little variation among
the input conditions, and this pattern held
over the two study intervals and the three
levels of practice of the present experiment.
Two further analyses of the data were completed:
(a) The effect of input conditions
was examined for the slow learners (the
nine subjects within each condition with the
fewest number of correct answers over all
three passages) and the fast learners (the
nine subjects within each condition with the
greatest number of correct answers over all
three passages). Again there was little variation
among the input conditions, and this
held for slow and for fast learners since all
Fs which jointly involved input conditions
and ability were not significant (all ps >
.05). (b) The effect of input conditions was
examined for the difficult questions (the six
questions given correctly the fewest number
of times in each condition on the first passage)
and the easy questions (the six questions
given correctly the greatest number of
times in each condition on the first passage).
Still again there was little variation
among the input conditions, and this held
for difficult and for easy questions since all
Fs which jointly involved input conditions
and question difficulty were not significant
(all ps > .05).


Discussion 

The input conditions had little apparent 
effect on output performance. There are 
several possibilities to account for this find- 
ing, and the following represents one of 
them. It is possible that while the encoding 
cues produced encoding of information 
needed for output, this same information 
was also encoded in the control condition; 
that is to say, the encoding cues did not 
produce encoding of information required 
by output that was not otherwise encoded. 
While there is no direct evidence that this 
was the case, there are several aspects of 
the experiment that make it plausible. Ex- 
periment II changed these features so that 
encoding cues are more likely to shift the 
subjects’ effective study time to encoding 
information required by output that is not 
encoded in the control condition. 


EXPERIMENT II 


Several changes were made. First, pas- 
sage length was increased from approxi- 
mately 220 words to approximately 6,000 
words so that in the absence of encoding 
cues (the control condition) the subjects' 
effective study time would be widely dis- 
tributed to encoding information through- 
out a larger amount of text material than in 
Experiment I. Second, there were encoding 
cues for 22 output items in Experiment I, 
which amounted to about one item for 
every ten words of text. In Experiment II, 
however, there were encoding cues for 30 
output items, which amounted to only 
about 1 item for every 200 words of text. 
With fewer output items relative to the 
amount of text in Experiment II, encoding 
cues seemed more likely to shift the sub- 
jects' effective study time to encoding infor- 
mation required by output that was not 
also encoded in the control condition. Third, 
only one type of encoding cue was used; 
namely, underlining. The parts of the pas- 
sage which identified information contained 
by the output questions were underlined as 
were the parts which identified information 
needed to answer the questions. Finally, 
each subject was given only a single pas- 
sage to learn; thus performance changes 
over successive passages were not studied. 


Method 


Passage. The reading passage consisted of the 
introduction to Educational and Philosophical 
Thought (Price, 1967), and was about 6,000 words 
in length. Thirty questions were constructed which 
could be answered by one or more words taken 
directly from the passage, and these questions 
were used for output. 

Design. Two input conditions were employed: 
a control in which the passage was presented and 
subjects were simply instructed to read the 
material as if they were studying for the test, and 
an underlined condition. In the underlined con- 
dition, the parts of the passages which identified 
information contained by the output questions 
were underlined as were the parts of the passage 
which identified information needed to answer 
the questions. For example, the following ques- 
tion was presented as output: “Who defined 
education in its broadest sense as social continuity 
of life?” The following sentence was underlined in 
the passage: “In the broadest sense, John Dewey 
says, education is the means of social continuity 
of life.” In the underlined condition, the subjects 
were told to read the material as if they were 
studying for a test, that the underlined material 
contained answers to the questions they would 
be asked later, and that they should concentrate 
on this underlined material. 

Procedure. The materials for each of the two 
conditions were combined into booklets, and 33 
subjects, undergraduate students at the University 
of Delaware, were randomly assigned to each con- 
dition. All of the subjects in both conditions 
served in the experiment at the same time. The 
instructions for the subjects were printed in the 
booklets. Immediately after reading the instruc- 
tions, subjects were given 25 minutes to study the 
passage, and immediately following this they were 
given as much time as they needed to write their 
answers to the 30 output questions. 


Results 


The mean number of correct answers in 
the control group was 9.24 and in the un- 
derlined group was 17.30. Therefore, under- 
lining resulted in about an 87% improve- 
ment over the control without underlining. 
For purposes of further analysis, the 33 
subjects in each condition were ranked from 
the greatest to least number of correct an- 
swers. Within each condition the first 11 
subjects were called fast learners, the next 
11 medium learners, and the next 11 slow 
learners. The mean number of correct an- 
swers for the fast, medium, and slow learn- 
ers in each condition is presented in Table 
3. The improvement resulting from under- 
lining was highly significant statistically (F 
= 175.4, df = 1/60, p < .01), the effect of 
ability was significant (F = 116.5, df = 
2/60, p < .01), and the interaction of un- 
derlining and ability was significant (F = 
8.20, df = 2/60, p < .01). The pattern of 
results in this interaction shows that the im- 
provement resulting from underlining is 
much greater for the fast and medium sub- 
jects than the slow subjects. 


TABLE 3 
MEAN NUMBER OF CORRECT ANSWERS FOR FAST, 
MEDIUM, AND SLOW LEARNERS IN EACH 
INPUT CONDITION 

               Input condition 
Learners     Control   Underlined 
Fast          14.18      23.45 
Medium         8.45      18.72 
Slow           5.09       9.72 
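
The arithmetic behind the 87% figure and the interaction can be checked from the cell means in Table 3. The sketch below is illustrative only: it assumes 11 subjects per cell (as stated in the text), treats the reported F for underlining (175.4) as exact in order to back out an error mean square, and uses variable names of our own choosing. Because the published means are rounded, the interaction F it yields is approximate and is meant only for comparison with the reported value.

```python
# Cell means from Table 3 (30-item test, 11 subjects per cell).
means = {
    "fast":   {"control": 14.18, "underlined": 23.45},
    "medium": {"control": 8.45,  "underlined": 18.72},
    "slow":   {"control": 5.09,  "underlined": 9.72},
}
N_PER_CELL = 11
F_UNDERLINING = 175.4  # reported F for the underlining main effect (df = 1/60)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Condition means and percentage improvement of underlined over control.
control = mean(m["control"] for m in means.values())
underlined = mean(m["underlined"] for m in means.values())
improvement_pct = 100 * (underlined - control) / control

# Raw gain from underlining within each ability group.
gains = {g: round(m["underlined"] - m["control"], 2) for g, m in means.items()}

# Marginal means for the 3 x 2 design.
col = {"control": control, "underlined": underlined}
row = {g: mean(m.values()) for g, m in means.items()}
grand = mean(col.values())

# Underlining main effect: df = 1, so its mean square equals its sum of squares.
ss_underlining = N_PER_CELL * len(means) * sum((c - grand) ** 2 for c in col.values())
ms_error = ss_underlining / F_UNDERLINING  # error mean square implied by the reported F

# Interaction: cell mean minus row mean minus column mean plus grand mean.
ss_interaction = N_PER_CELL * sum(
    (means[g][c] - row[g] - col[c] + grand) ** 2 for g in means for c in col
)
f_interaction = (ss_interaction / 2) / ms_error  # interaction df = (3 - 1)(2 - 1) = 2

print(f"improvement: {improvement_pct:.1f}%")
print("gains by ability group:", gains)
print(f"approximate interaction F: {f_interaction:.2f}")
```

The gains (roughly 9 and 10 items for fast and medium learners against fewer than 5 for slow learners) are what the interaction term summarizes.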


Discussion 

The results of Experiment II indicate 
that encoding cues can result in substantial 
facilitation of output performance in prose 
learning. Similar results using underlining 
were also recently reported by Cashen and 
Leicht (1970). The facilitation found in Ex- 
periment II was greater for fast- and medi- 
um-learning subjects than slow-learning 
subjects. These results are quite consistent 
with the conclusion that when underlining 
cues direct study time to encoding of infor- 
mation required by output, which is not oth- 
erwise encoded, then output performance 
will be increased. Because fast- and medi- 
um-learning subjects may be able, in a 
manner of speaking, to encode more in a 
constant study time than slow subjects, the 
concentration of study time on the under- 
lined parts of the text would be expected to 
benefit them more than slow subjects. 
The large effects of underlining in Exper- 
iment II, and also the Cashen and Leicht 
(1970) study, are quite in contrast with the 
small or null effects found by others when 
subjects do their own underlining (e.g., Id- 
stein & Jenkins, 1972; Stordahl & Christen- 
sen, 1956); nevertheless, such results are to 
be expected for at least two reasons. First, 
in those studies there is no reason to think 
that underlining shifts a subject’s study 
time to encoding information that is not en- 
coded in the control condition. It may be 
that in the underlining conditions, subjects 
simply underline what they would study 
even if no underlining was required. Second, 
in the studies where subjects do their own 
underlining, there is no reason to believe 
that underlining results in subjects encoding 
information which is required by the output 
test; that is, there is no systematic relation- 
ship between an analysis of output and 
what was underlined. 

Finally, the present findings lend addi- 
tional support to interpretations of prose 
learning which have their point of depar- 
ture in the processing or encoding of infor- 
mation into memory (cf. Anderson, 1970; 
Frase, 1970; Matz & Rohwer, 1970; Roth- 
kopf, 1970). By these interpretations, the 
activities in which the reader engages dur- 
ing instruction are of crucial importance. 
One task of instructional control or educa- 
tional engineering is to discover the condi- 
tions which insure the occurrence of appro- 
priate encoding activities. The present re- 
sults suggest that, under certain circum- 
stances, underlining may serve as one of 
these conditions. 


FACTOR-ANALYTIC AND CRITERION STUDY OF 
ACHIEVEMENT ORIENTATION¹ 


ROY C. HERRENKOHL² 
Lehigh University 


The research examined the conception of achievement motivation as 
a unitary construct. The development of a 160-item self-description- 
type instrument designed to assess achievement orientation was re- 
ported. This instrument was administered to 5,102 high school and 
college students, and responses to the items on the instrument were 
factor-analyzed. Ten dimensions, selected as offering the greatest 
likelihood of replication, were described. The degree of association be- 
tween selected demographic and personal variables and scores on the 
achievement-orientation dimensions was examined, as well as the 
association between the achievement-orientation scores and the two 
criterion variables, grade average and educational aspirations. The 
conclusion was that achievement orientation is multidimensional in 
nature, and that such a conception can aid understanding of achieve- 
ment-oriented behavior. 


In the preface to the first major presen- 
tation of research on achievement motiva- 
tion, McClelland, Atkinson, Clark, and 
Lowell (1953) described their decision to 
study only one motive because such an ap- 
proach allowed more concentrated study of 
human personality. However, the history 
of research on achievement motivation 
since then gives striking evidence that the 
definition and measurement of even one 
motive is a complicated task. The accumu- 
lation of research since that early work has 
tended to confuse and obscure rather than 
simplify and clarify. 

To a degree, the difficulty is associated 
with the thematic apperception method of 
measuring achievement motivation, pre- 
sented in the 1953 work of McClelland et 
al. This method is scored by a content 
analysis procedure which subdivides 
achievement imagery into one of several 
categories, but arrives at a single achieve- 
ment-motivation score by summing the 
frequencies of content for all categories. 
The use of several subcategories implies the 
presence of several psychological constructs, 
but the procedure for arriving at an 
achievement-motivation score denies this 
implication. Consequently, use of this pro- 
cedure has led to a history of confusion 
about the construct or constructs being 
identified by the Thematic Apperception 
Test (TAT) achievement-motivation scor- 
ing system devised by McClelland et al. 

Briefly, one can trace the major points in 
this controversy. Soon after the publication 
of the scoring system, the question was 
raised as to whether TAT need-for-achieve- 
ment scores of different magnitudes repre- 
sented different motivational dynamics. 
Atkinson (1957) suggested that there was a 
difference between persons with high scores 
and persons with low scores. The former 
were characterized by motivation to achieve 
and the latter by motivation to avoid fail- 
ure. Shortly thereafter, Atkinson and Lit- 
win (1960) differentiated both conceptually 
and operationally between motivation to 
achieve and motivation to avoid failure, 
using the Test Anxiety Questionnaire (Man- 
dler & Sarason, 1952) to measure failure 
avoidance. Atkinson has continued to assess 
motivation to achieve by the TAT method 
or by the French Test of Insight (French, 
1958). 

A further challenge to the singularity of 
the achievement-motivation construct came 
when the relevance of the McClelland et al. 
scoring system to both sexes was questioned 
(French & Lesser, 1964; Lesser, Krawitz, 
& Packard, 1963). In essence, the question 
raised was whether there were sex-specific 
dimensions of achievement motivation. 
More recently, the motivational conception 
itself has been challenged by Klinger and 
McNelly (1969) who contend that TAT- 
assessed need-for-achievement reflects so- 
cial status. These authors further suggest 
that there are four achievement-related 
dispositions differentially associated with 
social status. 

The difficulty, however, does not rest 
solely with the TAT. A review of other 
methods of measuring achievement motiva- 
tion, such as the Personal Preference Sched- 
ule (Edwards, 1954), the Test of Insight 
(French, 1958), or the Iowa Picture Inter- 
pretation Test (Hurley, 1955), also sug- 
gests that there may be more than one con- 
struct involved in achievement motivation. 
First, insofar as tests such as the Test of 
Insight or the Picture Interpretation Test 
are scored by variations of the McClelland 
et al. scoring procedure, two or more con- 
structs, comparable to those involved in 
TAT-assessed motivation, may be present. 
Second, the low correlations characteris- 
tically found between different measures of 
achievement motivation suggest the pres- 
ence of more than one construct. Low cor- 
relations such as those found by Himel- 
stein, Eschenbach, and Carp (1958), Mar- 
lowe (1959), Atkinson and Litwin (1960), 
and Heckhausen (1968), may be due to 
low reliability of measures, to differences 
between projective and self-description-type 
measures, or to the assessment of different 
constructs by the various measures. The 
latter is a compelling explanation since 
even different self-description measures 
possessing acceptable reliability do not tend 
to correlate with each other to any sig- 
nificant degree. Third, there is more direct 
evidence that achievement motivation is 
not unitary. Mitchell (1961) demonstrated 
that using a variety of achievement-oriented 
measures with female subjects, one could 
identify several meaningful factor analyt- 
ically derived dimensions. By the same 
method, Mehrabian (1968) demonstrated 
that there are separate achievement-ori- 
ented dimensions for males and for females. 

The present study was undertaken to 
determine whether what is referred to as 
achievement motivation consists of several 
constructs rather than one or two. Another 
purpose was to develop a measurement in- 
strument that has better psychometric 
properties and is easier to administer and 
score than previous measures. 

The report of the present study is in two 
parts: a factor-analytic study (Part 1) to 
define achievement-orientation dimensions, 
and a criterion study (Part 2) which ex- 
amines the relation between scores from the 
factor analytically derived dimensions and 
selected criterion variables. 


PART 1: FACTOR-ANALYTIC STUDY OF 
ACHIEVEMENT-ORIENTATION 
DIMENSIONS 


Procedure 

Initially, a pilot version of the achievement- 
orientation instrument was developed. A set of 
140 items representing different categories of con- 
cern about success or failure was written. The 
items were based on content identified in the exist- 
ing measures of achievement motivation. In addi- 
tion, an attempt was made to identify persons 
who have no reactions to success and failure by 
including items which make it possible to deny 
such feelings. These items were suggested by 
Collins (1968) who has demonstrated that some 
persons characteristically deny presumably unac- 
ceptable feelings in situations where one would 
expect these feelings to be experienced. 
Items were in the form of statements to which 
a respondent answered true or false, depending on 
whether he considered the statement descriptive 
of himself. 

The resulting instrument was pilot-tested, 
being administered to 690 college undergraduates 
whose responses were factor-analyzed. A 10- 
factor solution was considered the most meaning- 
ful. With these results as a guide, the measure 
was revised by dropping those items that had 
relatively low absolute factor loadings and adding 
to each dimension other items believed to reflect 
more closely the focus of the items with higher 
loadings. By this method, 40 items were dropped 
from the original 140, and 60 were added, resulting 
in a 160-item revised measure. The revised instru- 
ment was then administered to a new sample of 
anonymous respondents. 


Subjects 

The sample of 5,102 respondents consisted of 
2,774 high school students and 2,328 college stu- 
dents, 54% and 46% of the sample, respectively. 
High school students were from nine public high 
schools in either metropolitan or urban areas of 
Pennsylvania and Maryland. College students 
were from 16 colleges in all geographical areas of 
the country. The colleges were both publicly 
and privately supported, had both coeducational 
and single-sex student bodies, were both 2- and 
4-year colleges, and ranged in selectivity of admis- 
sions from very high to no selectivity beyond 
completion of high school. Table 1 presents a 
tabulation of several characteristics of respond- 
ents. The sample is relatively balanced between 
college students and high school students with 
considerable diversity of socioeconomic back- 
ground. One weakness of the sample, viewed from 
the standpoint of reflecting diversity of back- 
ground, is the relatively small number of Negro 
respondents. Another possible weakness is that 
few respondents lived in urban ghettolike condi- 
tions. 


Factor-Analytic Procedure 


A major question about the results of any factor 
analysis is the stability of the factor structures 
that are identified. To examine this issue, two 
parallel analyses of responses, rather than one, 
were done. This was accomplished by first factor- 
ing the responses of the first and every other 
respondent in the sample (the odd respondents, 
N = 2,551), and then factoring the responses of 
the second and every other respondent (the even 
respondents, N = 2,551). Principal-components 
solutions were obtained by factoring a matrix of 
Pearson product-moment correlations with 1.00s 
in the diagonal. Varimax rotations were made of 
2 through 15 factors. The similarity of factor 
structures for each number of factors rotated was 
examined by computing Burt's unadjusted cor- 
relation coefficient (also called Tucker’s phi or 
the coefficient of congruence; Tucker, 1957) be- 
tween the two different solutions. A high degree 
of congruence was found for the solutions up to 
and including 10 factors rotated. When 11 or more 
factors were rotated, there was a marked drop in 
the maximum coefficient for some factors. The 
implication is that when 11 factors were rotated, 
factors that had less likelihood of being replicated 
began to arise. Thus, it was decided to use the 
factor structure identified by the rotation of 10 
factors. From the outset of this research, approxi- 
mately 10 factors had been hypothesized as repre- 
senting the constructs underlying the items in- 
cluded in the instrument. 
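
The congruence check described above can be sketched as follows. This is a minimal illustration, not the original analysis: the loading values are hypothetical, the function names are our own, and matching each odd-half factor to its best even-half counterpart by the largest absolute coefficient is an assumption about how the "maximum coefficient" was read off.

```python
import math


def congruence(x, y):
    """Coefficient of congruence between two columns of factor loadings.

    Unlike a Pearson correlation, the loadings are not centered first:
    phi = sum(x*y) / sqrt(sum(x^2) * sum(y^2)).
    """
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator


def best_matches(odd_factors, even_factors):
    """For each odd-half factor, the largest |phi| with any even-half factor."""
    return [
        max(abs(congruence(f, g)) for g in even_factors)
        for f in odd_factors
    ]


# Hypothetical loadings of four items on two rotated factors in each half.
# Factors need not come out in the same order in the two solutions.
odd_half = [
    [0.70, 0.65, 0.10, 0.05],
    [0.08, 0.12, 0.72, 0.60],
]
even_half = [
    [0.05, 0.15, 0.68, 0.66],
    [0.74, 0.61, 0.12, 0.02],
]

matches = best_matches(odd_half, even_half)
print([round(m, 3) for m in matches])
```

Repeating this for each number of rotated factors and watching for the point at which some maximum coefficients fall away is the kind of comparison that led to the 10-factor choice.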


RESULTS 


Definition of Achievement-Orientation 
Dimensions 

When the number of factors was decided 
upon, the next issue was to determine 
which combination of items, using unit- 
weighted summation scores, gave the max- 
imum reliability on a dimension. The first 
step was to compute the Spearman-Brown 
estimate of Cronbach’s alpha for the differ- 
ent combinations of items until that com- 
bination with the maximum reliability was 
identified for each dimension. All subjects 
were then scored and Cronbach’s alpha was 
computed for each dimension. Table 2 con- 
tains the number of items scored, the mean, 
the standard deviation, internal consistency 
reliability (Cronbach’s alpha), and the test- 
retest reliability coefficient for each of the 
10 dimensions. The test-retest coefficients 
are based on a sample of 97 male and fe- 
male respondents, some from a 4-year 
college and some from a 2-year community 
college, retested 3 weeks after the initial 
testing. 


TABLE 1 
CHARACTERISTICS OF RESPONDENT SAMPLE 

Characteristic                 % 

Sex 
  Male                        51 
  Female                      47 
  N.A.                         2 
Race 
  White                       86 
  Negro                        3 
  Other                        1 
  N.A.                        10 
Religious preference 
  Protestant                  28 
  Jewish                       8 
  Roman Catholic              21 
  Other                        4 
  None                         5 
  N.A.                        34 
Economic status* 
  I                           16 
  II                          14 
  III                         20 
  IV                          13 
  V                           16 
  VI                          10 
  VII                          5 
  N.A.                         6 
Years of S[illegible] 
  9 or less                   [illegible] 
  Some high school            35 
  Completed high school       [illegible] 
  College                     10 
  Postcollege                  6 
  N.A.                        [illegible] 

Note.—N = 5,102. Abbreviations: N.A. = not applicable. 
* Status levels are those defined by Hollingshead and Redlich (1958). 


TABLE 2 
NUMBER OF ITEMS, MEANS, STANDARD DEVIATIONS, CRONBACH'S ALPHA RELIABILITY, AND TEST-RETEST 
RELIABILITY FOR 10 ACHIEVEMENT-ORIENTATION FACTORS 

                                                     Number of                   Cronbach's    Test- 
Factor                                               items(a)    M(b)    SD(b)   alpha(b)      retest(c) 
A: Test anxiety                                         18       8.43    4.23      .84           .80 
B: Threat of failure                                    14       4.14    3.07   [illegible]      .81 
C: Parental encouragement                               13       7.75    2.94   [illegible]      .88 
D: Unwillingness to risk failing                        11       4.14    2.79      .76           .81 
E: Dislike of those who do better than oneself          17       1.74    2.18      .73           .82 
F: Concern about primary roles                           9       6.17    2.50      .79           .86 
G: Desire to excel                                      29      23.09    4.85      .81           .81 
H: Sensitivity to others’ knowing of one's failure      17       7.23    4.22      .84           .78 
I: Exerting effort to do well                           16       9.31    3.09      .71           .80 
J: Valuation of competition                              9       6.32    2.17      .74           .85 

(a) Seven items included in the measurement instrument are not scored. 
(b) N = 5,102. 
(c) N = 97. 
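
For readers unfamiliar with the two reliability indices named above, a minimal sketch follows. It assumes true/false items scored 1 when answered in the keyed direction and 0 otherwise, summed with unit weights as described in the text; the response matrix is hypothetical and the function names are our own. The Spearman-Brown function is the general prophecy formula and is shown only to illustrate how reliability is expected to change with test length, not to reproduce the item-selection procedure actually used.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(responses[0])
    item_variances = [pvariance([person[i] for person in responses]) for i in range(k)]
    total_variance = pvariance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical data: six respondents, four items keyed 1 = keyed direction.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

alpha = cronbach_alpha(responses)
doubled = spearman_brown(0.5, 2)  # a test of reliability .50, doubled in length

print(f"alpha = {alpha:.3f}")
print(f"Spearman-Brown prediction = {doubled:.3f}")
```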


Content of Dimensions 


The next step was to examine the content 
of items scored on each factor analytically 
defined dimension. No item is scored on 
more than one dimension. A brief descrip- 
tion of the content is given below. 

Factor A: Test anxiety. Items with high 
loadings on this factor are “I get upset be- 
fore exams” (keyed true); and “My stom- 
ach often gets upset before exams” (keyed 
true). A high score indicates a high degree 
of anxiety related to taking examinations. 

Factor B: Threat of failure. Items with 
high loadings on this factor are “Even when 
I do something well, my efforts are not 
recognized” (keyed true); and “Failure al- 
ways seems to be at my heels” (keyed 
true). A high score indicates a high degree 
of feeling that one gains little recognition 
for doing something well and of feeling like 
a failure. 


Factor C: Parental expectations. Items 
with high loadings on this factor are “My 
parents encouraged me to be above average 
in whatever I undertook" (keyed true); and 
“My parents always encouraged me to do 
my best in everything” (keyed true). A 
high score indicates a high degree of pa- 
rental encouragement and parental ex- 
pectations that the respondent “be above 
average,” “do better than others,” or “be 
best” in whatever activities are undertaken. 

Factor D: Unwillingness to risk failing. 
Items with high loadings on this factor are 
“I would avoid a course in which I might 
do poorly, even if it were very interesting” 
(keyed true); and “I prefer easy courses, 
even if they are dull” (keyed true). A high 
score indicates avoidance of activities in 
which one might do poorly, preference for 
“easy” activities, and satisfaction with 
doing average work. 

Factor E: Dislike of persons who do better 
than oneself. Items with high loadings on 
this factor are “I dislike people who do 
better than I do” (keyed true); and “I 
dislike people who are more popular than 
I am” (keyed true). The percentage of 
responses in the keyed direction for these 
and other items on this factor is low; 10 
of the 17 items have less than 10% respond- 
ing in the keyed direction. A high score 
indicates a strong dislike for people who do 
better than the respondent. 

Factor F: Concern about failing in pri- 
mary roles. Items with high loadings on 
this factor are “There are times when I 
worry about being a successful husband 
(or wife)” (keyed true); and “There are 
times when I worry about being a success- 
ful parent” (keyed true). A high score indi- 
cates a high degree of concern about being 
successful as a husband or wife or parent 
and about being successful in a job or 
career. 

Factor G: Desire to excel. Items with high 
loadings on this factor are “I have a strong 
desire to be above average” (keyed true); 
and “I prefer friends who want to be above 
average” (keyed true). A high score indi- 
cates a desire to do well (or do above av- 
erage), and a low score appears to suggest 
a defensive quality of denying concerns 
about doing well. It is notable that the 
mean score on this dimension is high, 
23.09, with an upper limit of 29.0; however, 
scores range as low as 1.0. 

Factor H: Sensitivity to others’ knowing of 
one’s failure. Items with high loadings on 
this factor are “Even if I were to fail in 
something important, I could still face my 
friends” (keyed false); and “Even if I 
failed in something important I could al- 
ways face my parents” (keyed false). A 
high score indicates a high degree of sensi- 
tivity to others’ knowing of one’s failure. 
There is also the implication that persons 
who have high scores would avoid their 
friends because of sensitivity to what the 
friends would think about their failing. 

Factor I: Exerting effort to do well. Items 
with high loadings on this factor are “I at- 
tempt to learn everything I can from a 
course” (keyed true); and “I exert extra 
effort to do a difficult course assignment 
well” (keyed true). A high score indicates 
making a strong effort to do well. 

Factor J: Valuation of competition. Items 
with high loadings on this factor are “I 
like competition” (keyed true); and “I pre- 
fer friends who are not competitive" (keyed 
false). A high score indicates a liking for 
competition. The orientation is “desiring to 
do better than others,” with particular ref- 
erence to competition. 


Intercorrelation of Factor Scores 


The final step in the factor-analytic study 
was to intercorrelate scores on each achieve- 
ment-orientation dimension with scores on 
every other dimension. The results are 
given in Table 3. 


PART 2: A CORRELATIONAL STUDY OF 
RELATIONSHIPS BETWEEN 
ACHIEVEMENT-ORIENTATION 
SCORES AND SELECTED 
CRITERION VARIABLES 


The purpose of the correlational study 
was to examine the validity of the con- 
structs underlying the factor analytically 
defined dimensions. Two questions are con- 
sidered. First, can the achievement-orienta- 
tion scores be explained by demographic 
characteristics, such as race or socioeco- 
nomic status, or by personal characteristics, 
such as age or birth order? Second, are the 
achievement-orientation scores associated 
with factors, such as academic performance 
or educational aspirations, that have often 
been considered to be influenced by achieve- 
ment motivation? 


Procedure 


Following the administration of the achieve- 
ment-orientation questionnaire, respondents were 
asked their sex, race, religious preference, the fre- 
quency with which they attend religious services, 
their age, birth order, academic level, their fa- 
ther’s occupation, and the level of education com- 
pleted by their father and mother. They were also 
asked to indicate their grade average and the grade 
level of education they would like to complete. 
Step-wise multiple-regression analyses were done 
to examine the degree to which differences in 
such personal or demographic characteristics 
were associated with differences in achievement- 
orientation scores. Characteristics such as sex, 
race, and religious preference, which are composed 
of discrete categories, were dummy-coded for the 
analysis following a procedure described by Col[illegible] 
(1968) and others. Father's occupational status 
was coded by the National Opinion Research 
Center status ranking system developed by North 
and Hatt (described in Reiss, 1961). A second set 
of regression analyses was done to determine the 
degree to which the achievement-orientation 
scores were associated with an individual’s grade 
average or the level of education he desired to 
complete, after personal and demographic char- 
acteristics were controlled. 


RESULTS 


The multiple-regression procedure pro- 
vided statistical evidence about the rela- 
tionships between variables included in the 
present study. However, after examining 
this evidence it was felt that some criterion 
for importance of a relationship was needed. 
Since the sample size was large, statistical 
significance seemed an inappropriate crite- 
rion. A correlation of only .03, involving no 
more than .09% variance, is statistically 
significant at the p = .05 level. Conse- 
quently, the decision was made to put the 
emphasis on the magnitude of the relation- 
ship. Only those associations accounting for 
1% or more variance will be discussed, al- 
though overall variance associated with a 
group of variables, that is, demographic- 
personal or achievement orientation, is 
given. 
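
The point about sample size can be made concrete with a short calculation. The sketch below is illustrative: it uses the full sample of 5,102 (listwise deletion would reduce this somewhat), the standard t statistic for a Pearson correlation, and a large-sample two-tailed critical value of 1.96 as an approximation. The function names are our own.

```python
import math

N = 5102  # full respondent sample


def variance_explained_pct(r):
    """Percentage of variance shared by two variables correlated at r."""
    return 100 * r ** 2


def t_for_correlation(r, n):
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r_small = 0.03
pct_small = variance_explained_pct(r_small)
t_small = t_for_correlation(r_small, N)

# Smallest correlation that meets the 1%-of-variance criterion adopted here.
r_threshold = math.sqrt(1 / 100)

print(f"r = {r_small}: {pct_small:.2f}% of variance, t = {t_small:.2f}")
print(f"r needed for 1% of variance: {r_threshold:.2f}")
```

A correlation of .03 therefore clears the conventional significance threshold while sharing less than a tenth of one percent of variance, which is why magnitude rather than significance is used as the criterion.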

The multiple correlations and proportion 
of variance accounted for on the achieve- 
ment-orientation dimensions by the differ- 
ent demographic and personal characteris- 
tics are presented in Table 4. The multiple 
correlations range from R = .17 to .25. In 
other words, the demographic and personal 
variables do not account for more than 
6.5% of the variance on any achievement- 
orientation dimension. 

The sex of the respondent is most fre- 
quently and strongly related to the achieve- 
ment-orientation dimensions. Being female 
is associated with higher scores on test 
anxiety, concern about primary roles, and 
exerting effort to do well. Being male is as- 
sociated with higher scores on parental en- 
couragement, dislike of those who do better 
than oneself, and valuation of competition. 
The respondent’s academic level is related 
to three achievement-orientation dimen- 
sions. High school students have higher 
scores on threat of failure and sensitivity 
to others' learning of one's failure, while 
college students have higher scores on de- 
sire to excel. 

Four additional variables are associated 
with achievement-orientation scores. Those 
who attend religious services more fre- 
quently have higher scores on both exerting 
effort to do well and valuation of competi- 
tion. The higher the father’s occupational 
status, the lower the scores on unwillingness 
to risk failure and the higher the scores on 
desire to excel. Higher levels of paternal 
education are associated with higher scores 
on parental encouragement. Lower levels 
of maternal education are associated with 
higher scores on threat of failure. 


Relationship between Achievement-Orientation 
Scores and Two Criterion Variables 

The degree of association between 
achievement-orientation variables and each 
of the two criteria, grade average and as- 
pirations for education, was determined by 
“forcing” the demographic and personal 
characteristics into the regression equation 
and then entering the achievement-orienta- 
tion variables. Table 5 presents these re- 
sults. 


Grade Average 

Demographic and personal characteristics 
account for 5.7% (R = .24) of the variance 
on grade average. Higher grade averages 
are associated with being female and having 
a father whose occupational status is higher. 
Achievement-orientation variables account 
for an additional 20.2% of the variance on 
grade average. The multiple correlation in- 
creases from R = .24 to R = .51. Higher 
grade averages are associated with lower 
scores on Factor B (threat of failure), with 
higher scores on Factor G (desire to excel), 
with lower scores on Factor D (unwilling- 
ness to risk failure), and with lower scores 
on Factor J (valuation of competition). 


Level of Education Desired 

Demographic and personal characteristics 
account for 17.2% (R = .41) of the vari- 
ance on level of education desired. Higher 
levels of education are desired by respond- 
ents in higher grade levels, respondents who 
are male, and respondents whose mother 
and father had higher levels of education. 
Achievement-orientation variables account 
for an additional 7.4% of the variance on 
level of education desired. The multiple 
correlation increases from R = .41 to R = 
.50. More education is desired by those who 
have low scores on Factor D (unwillingness 
to risk failure), and who have high scores on 
Factor G (desire to excel). 


TABLE 5 


CUMULATIVE MULTIPLE CORRELATIONS OF DEMOGRAPHIC-PERSONAL CHARACTERISTICS AND 
ACHIEVEMENT-ORIENTATION DIMENSIONS WITH TWO CRITERION VARIABLES 


                                                     Grade average        Desired level of education 
Variable                                            Cum R   Variance %     Cum R   Variance % 

Demographic-Personal 
  Sex                                               .157      2.5          .381      2.6 
  Father's occupational status                      .203      1.7           —         — 
  Father's educational level                         —         —           .399      1.4 
  Mother's educational level                         —         —           .345      5.1 
  Academic level                                     —         —           .261      6.8 
  Cumulative R                                      .240       —           .414       — 
  Accumulated percentage of variance                 —        5.7           —       17.2 
Achievement Orientation 
  B: Threat of failure                              .399     10.2           —         — 
  D: Unwillingness to risk failing                  .483      3.1          .467      4.4 
  G: Desire to excel                                .450      4.8          .492      2.8 
  J: Valuation of competition                       .494      1.1           —         — 
  Accumulated increment in percentage of variance 
    over Demographic-Personal                        —       20.2           —        7.4 
Total cumulative R (Demographic-Personal and 
  Achievement Orientation)                          .509       —           .496       — 
Total accumulated percentage of variance (Demo- 
  graphic-Personal and Achievement Orientation)      —       25.9           —       24.6 


Note.—Multiple correlations given under Cum R are the values at the "step" when that variable 
entered the regression equation. Values given under Variance % 


Discussion 
The Factor-Analytic Study 


A second reason for designing the instru- 
ment was to have a measure with better 
psychometric properties than the TAT 
possesses. As Table 2 indicates, internal- 
consistency reliabilities of the achievement- 
orientation dimensions provide an ade- 
quate basis for using the measure in experi- 
mental studies. The test-retest reliabilities, 
which range from .78 to .88, are high for 
attitude and personality test dimensions. 
Further study is under way to obtain a 
clearer picture of the dimensions’ validity. 

Another reason for developing the meas- 
ure was to examine the hypothesis that 
“achievement motivation” consists of sev- 
eral constructs. The preceding results sup- 
port this hypothesis, at least when it is 
tested on the present group of objective 
self-description type items. Overall, the 
item content of the different dimensions is 
clear as to meaning. Work is in progress to 
determine whether or not additional dimen- 
sions specific to one sex or the other or to 
high school or college students can be iden- 
tified. 

The intent of the factor-analytic study 
was to identify distinct psychological con- 
structs which make up an individual's 
achievement orientation. However, even 
factor analytically defined, orthogonal di- 
mensions, when scored, may correlate with 
each other and these correlations are sug- 
gestive of relationships between the con- 
structs. Examination of the correlations in 
Table 3 points to several considerations. 

The largest correlation is between Factor 
A (test anxiety) and Factor B (threat of 
failure) (r = .41). Generally, the achieve- 
ment-motivation literature has viewed these 
two constructs as related, if not more or 
less identical. Here, they are positively re- 
lated, but not identical since only 16% of 
their variance is shared. Other correlations 
underscore the part social approval plays 
in certain aspects of achievement orienta- 
tion. Factor H (sensitivity to others’ know- 
ing of one’s failure) is related to Factor A 
(test anxiety) (r = .38), to Factor G (de- 
sire to excel) (r = .40), and to Factor F 
(concern about primary roles) (r = .37). 
Certain correlations tend to support previ- 
ous findings that a desire to succeed is 
associated with a willingness to risk failure 
while an unwillingness to risk failure is re- 
lated to threat of failure. These are the 
correlation between Factor D (unwilling- 
ness to risk failing) and Factor I (exerting 
effort to do well) (r = -.36), and the cor- 
relation between Factor D and B (threat of 
failure) (r = .26). 

An important issue for any factor-ana- 
lytic results is the possibility of replicating 
them. Two types of replication have already 
been done. One was a comparison of results 
from the pilot factor analysis with results 
from the factor analysis of the revised 
measure. Eight of 10 factors were essen- 
tially the same, although 30% of the items 
on the pilot measure were replaced and the 
respondents on the revised measure were 
more diverse. The other replication data 
were the parallel factor analyses of re- 
sponses on the revised measure. Only those 
factors which were clearly the same on the 
two analyses were rotated. In contrast to 
many previous studies based on more lim- 
ited samples of respondents, the present 
sample also provides a better basis upon 
which to define dimensions representative 
of the achievement orientation of American 
high school and college students. Of course, 
were one to depart to any notable degree 
from the present item content, the factor 
structure could change. Indeed, given the 
somewhat arbitrary manner in which the 
domain of item content must be defined, 
the scientific contribution of a study such 
as the present one rests not in its defining 
all possible constructs, but in the contribu- 
tion to our understanding that the defined 
constructs make. 


The Correlational Study 


One issue around which there has been 
considerable confusion is the influence of 
sex differences on achievement orientation. 
The respondent’s sex is the variable most 
consistently associated with the achieve- 
ment-orientation scores. Respondents’ sex 
accounts for 1% or more of the variance on 
six dimensions. Being male involves greater 
valuation of competitive activity which 
accords with the general conception of the 
American male role. Males also perceive 
themselves as receiving more parental en- 
couragement to achieve and do well. Pa- 
rental encouragement of male children 
more than female children also coincides 
with the general view that more attention 
is paid to the accomplishments of the 
American male child than to those of the 
female child. Women tend to express more 
concern to perform successfully their role 
as wife and mother, acknowledge a greater 
degree of test anxiety, and feel they exert 
greater effort to achieve. The emphasis on 
the roles of mother and wife coincides with 
the general role expectations for the Ameri- 
can woman. As for the difference on test 
anxiety, men may be hesitant to acknowl- 
edge anxiety because they feel that it is 
"unmanly" to express such feelings; con- 
sequently, women score higher because they 
have less conflict about acknowledging 
anxiety. If there is an actual difference in 
levels of anxiety between men and women, 
the difference might be due to women’s 
receiving less parental support for their 
achievement strivings. As a result they have 
stronger feelings of insecurity in undertak- 
ing achievement tasks. The exertion of 
greater effort to do well may be women’s 
means of coping with their greater anxiety. 
If this line of reasoning were correct, one 
might find higher test anxiety in women 
who have brothers than in those who do 
not. The presence of brothers would under- 
score for the female child the differential 
in parental support. As a result of this in- 
creased awareness, women with brothers 
might also be found to exert more effort to 
achieve in order to overcome their greater 
anxiety. Further study is needed to test 
these hypotheses. 

The association between the sex of the 
respondent and scores on the achievement 
orientation dimensions is also reflected in 
academic performance. As Table 5 indi- 
cates, being female is associated with higher 
grade averages. More detailed examination 
of the data points up that after the two 
most important achievement-orientation 
factors—threat of failure and desire to ex- 
cel—are statistically controlled, Factor I 
(exerting effort to do well) is positively 
correlated with grades, and Factor J (valu- 
ation of competition) is negatively corre- 
lated with grades. Thus, the male’s stronger 
competitive orientation appears to be detri- 
mental to academic performance while the 
female’s stronger orientation to exert effort 
is facilitative. This result is particularly 
interesting in view of the fact that Factors 
I and J themselves have a positive (zero- 
order) correlation, which changes when 
threat of failure and desire to excel are 
controlled. 

This latter result has particular signifi- 
cance for the assumption that a multidi- 
mensional measurement of achievement 
orientation provides a clearer understand- 
ing than earlier methods. For example, the 
McClelland et al. (1953) definition of 
achievement motivation, “competing with a 
standard of excellence,” suggests combining 
content associated both with valuation of 
competition and with exerting effort to do 
well. The practice of combining the two 
types of content would be supported by 
the positive (zero-order) correlation be- 
tween the two factor analytically derived 
dimensions. However, such a practice is not 
supported by the finding that, after other 
achievement-orientation dimensions are ac- 
counted for, orientation to competition and 
orientation to exerting effort are associated 
with grade average to an approximately 
equal degree, but in opposite directions. 
Consequently, to have combined the two 
types of content would have resulted in 
failure to identify their differing effects. 

Table 5 indicates that the sex of the re- 
spondent accounts for essentially equal 
proportions of variance on grade average 
and on desired level of education. However, 
the associations are in opposite directions. 
Being female is associated with higher 
academic performance but lower aspira- 
tions for education. This is somewhat sur- 
prising since one would expect higher per- 
formance to be associated with higher as- 
pirations. Presumably, factors such as the 
women’s perception that their parents ex- 
pected less of them and limitations on the 
numbers and types of careers available to 
women with high levels of education tend to 
reduce women’s aspirations for education. 

Another issue to which increasing atten- 
tion has been paid is the relationship be- 
tween a respondent’s socioeconomic status 
and his achievement orientation. As has al- 
ready been indicated, the present results 
indicate a significant but relatively small 
association between indicators of socio- 
economic status and the achievement-ori- 
entation dimensions themselves. Taken to- 
gether, the demographic and personal 
characteristics studied 
do not exceed a correlation of R = .2 
(6.5% of the variance) on any dimension. 
However, there is more to the picture. 

An examination of the manner in which 
demographic and achievement-orientation 
variables are associated with the two cri- 
terion variables provides a striking con- 
trast. Demographic variables account for 
5.7% (R = .24) of the variance on grade 
average; they account for 17.2% (R = .41) 
of the variance on desired level of educa- 
tion. Of the variance accounted for on 
grade average by the demographic varia- 
bles, sex of the respondent contributes the 
largest portion. Father's occupational status 
is the only other variable to account for 
more than 1% of the variance. Of the vari- 
ance accounted for on desired level of edu- 
cation the respondent’s academic level ac- 
counts for more than one-third, the 
respondent’s mother’s level of education 
accounts for slightly less than one-third, 
and the respondent’s father’s level of edu- 
cation accounts for a smaller but notable 
amount. The sex of the respondent ac- 
counts for most of the remainder. 

There is a much smaller association be- 
tween the father’s education level and the 
child’s aspirations for education than there 
is between the mother’s educational level 
and the child’s aspirations. It is generally 
held that if one parent is to have more 
education than another, it is most impor- 
tant for the head-of-household, usually the 
father, to have the most education in order 
to provide for the family. However, these 
results suggest that the mother’s education 
level is the stronger influence on her chil- 
dren’s aspirations for education. 

Achievement-orientation dimensions, as 
noted above, account for 20.2% of the vari- 
ance on grade average, contrasting with 
the 5.7% of variance accounted for by 
the demographic-personal variables. While 
further research is needed, these results 
suggest that in order to understand aca- 
demic performance, an individual's achieve- 
ment orientation should be a central con- 
sideration. Particularly important is the 
impact of high scores on Factor B (threat 
of failure) which are associated with low- 
ered academic performance. A search for 
the reasons that a person feels or does not 
feel that failure is a threat would seem to 
offer the most promise for understanding 
how an individual develops his particular 
orientation to achievement. Questions about 
whether the individual’s reaction is a result 
of actual failure or is an intense concern 
about possible failure are important. Fur- 
thermore, students at lower academic lev- 
els, that is, high school students, have 
higher scores on Factor B than students at 
higher academic levels, that is, college stu- 
dents. This suggests the possibility that the 
threat of failure dimension may be impli- 
cated in selection for college. 

Two important issues are not examined 
in the present study. One is the nature 
of relationships between TAT-assessed 
achievement motivation and achievement 
orientation as defined by the factor analyt- 
ically defined dimensions. The other is the 
nature of the relationship between intelli- 
gence and the achievement-orientation di- 
mensions. Examination of the former issue 
will make it possible to relate the present 
study more closely to previous work on 
achievement motivation. Examination of 
the latter issue may provide understanding 
of the degree to which an individual's abil- 
ity is a determinant of his achievement 
orientation. 


IQ, SOCIOECONOMIC STATUS, AND SHORT-TERM MEMORY 


D. E. ORN AND J. P. DAS¹ 
University of Alberta, Edmonton, Canada 


High- and low-socioeconomic status (SES) was combined with av- 
erage- and low-IQ levels to select four samples of children. The SES 
levels were matched on IQ, whereas within each SES the two IQ sam- 
ples were matched on mental age. Each child was given visual and 
auditory short-term memory tasks and a visual to auditory cross- 
modal coding task. It was observed that whereas in the average IQ 
level, the high-SES sample did better than the low-SES sample in 
short-term memory, this was reversed in the low-IQ level. This cross- 
over effect was interpreted in terms of Jensen’s hypothesis regarding 
the distribution of associative and reasoning abilities in high- and 
low-SES subpopulations. The possibility of classifying high-SES re- 
tardates as primary and low-SES retardates as secondary was dis- 
cussed. 


Subnormal children from low socioeco- 
nomic status (SES) often appear to be 
brighter than those from middle and high 
SES. This has been empirically shown in sev- 
eral studies that compared them on associa- 
tive memory (Jensen, 1970). Generally, the 
middle SES group was found to be inferior 
in rote learning tasks such as serial learning 
and paired-associate learning. Jensen (1967) 
further noticed that IQ did not predict the 
associative learning ability of low SES sub- 
jects nearly as well as it did of high SES 
subjects. Standard IQ tests are predomi- 
nantly tests of reasoning and abstraction 
rather than of associative memory ability. 
It was therefore possible to assume that 
there was a disparity in the associative 
ability of low and middle SES groups of 
equal IQ. 

There may be more than one explanation 
for this disparity. A seemingly obvious one 
would be to assume that the high SES re- 
tardate was a true retardate, whereas the 
low SES one was merely a culturally de- 
prived child, a victim of an unstimulating 
early environment. Given enough practice, 
the latter comes out superior to the former. 
A study on rote learning carried over three 
days showed that the low SES subnormal 
subjects did increasingly better than their 
high SES counterparts from the first 
through the third day (Rapier, 1966). 

¹ The research was supported by the Alberta 
Human Resources Research Council. 
Requests for reprints should be sent to J. P. 
Das, Center for the Study of Mental Retardation, 
University of Alberta, Edmonton, Alberta, 

Jensen (1970) suggests that most high 
SES retardates are deficient both in asso- 
ciative learning (Level I) and in reasoning 
(Level II), whereas most low SES retard- 
ates are deficient only in Level II. The 
digit span and short serial-learning tests 
measure Level I ability, and test scores on 
these are not correlated with IQ in the low 
SES IQ samples, although they are correlated in 
high SES samples of both low and average 
IQ. He further observes that good Level I 
ability is essential for Level II, but beyond 
this threshold, Level I is not a predictor of 
Level II. Since low SES is characterized by 
lower performance in Level II tasks, low 
SES subjects come out to be lower in IQ 
but should be equal in Level I tasks. Do 
these assumptions explain why the low SES 
retardate should be better than the high 
SES retardate in associative learning? 

The SES X IQ interactions should be 
found in retardates. The low SES retardate 
should be inferior to the high in Level II 
because of known IQ differences between 
the two groups. Since both are below the 
average threshold in IQ, Levels I and II 
will be positively related to some extent. 
Now, if one randomly selects a group of 
high SES and a group of low SES retard- 
ates, the latter will be lower in measured 
IQ (but equal at Level I) because of SES. 
If one further wishes to match the low 
SES subjects on IQ with those subjects 
that are high, one would be compelled to 
select brighter, low SES individuals who 
will be equal in Level II with the high 
SES subjects, and will also be superior to 
them in Level I due to the process of selec- 
tion. This superiority in Level I would be 
reflected in their performance on digit span 
and serial learning tasks. 
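
The selection argument above can be illustrated with a small simulation. The sketch below is only a toy model under stated assumptions: Level I is drawn from the same distribution in both SES groups, Level II (standing in for measured IQ) correlates .5 with Level I, and low SES lowers Level II by one standard deviation. The loading, the shift, and the matching band are hypothetical values chosen for illustration, not estimates from the study.

```python
import random


def simulate(n=20000, ses_shift=1.0, seed=0):
    """Return mean Level I of IQ-matched high-SES and low-SES low-IQ samples."""
    rng = random.Random(seed)

    def draw(shift):
        # Level I has the same distribution in both SES groups (assumption).
        level1 = rng.gauss(0.0, 1.0)
        # Level II depends partly on Level I; low SES shifts it downward (assumption).
        level2 = 0.5 * level1 + rng.gauss(0.0, 0.75 ** 0.5) - shift
        return level1, level2

    high = [draw(0.0) for _ in range(n)]
    low = [draw(ses_shift) for _ in range(n)]

    def mean_level1(group, lo=-2.0, hi=-1.0):
        # "Matching on IQ": keep only subjects whose Level II falls in the same band.
        chosen = [l1 for l1, l2 in group if lo <= l2 <= hi]
        return sum(chosen) / len(chosen)

    return mean_level1(high), mean_level1(low)


high_mean, low_mean = simulate()
print(high_mean, low_mean)
```

Under these assumptions the matched low SES sample has the higher mean Level I score, which is the direction the argument predicts; the size of the gap depends entirely on the assumed parameters.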

The task variable could be manipulated 
on a Level I-Level II continuum in order 
to highlight the nature of the difference 
between the performance of high and low 
SES groups. If the tasks require transfor- 
mations and integration of complex infor- 
mation, they will reflect IQ differences, 
whereas if the tasks require a short-term 
recall of stimulus items, whose number is 
within the subject's span of apprehension, 
IQ will not be a relevant determinant of 
performance. There would be some merit in 
varying the tasks on these aspects while 
essentially preserving their short-term mem- 
ory characteristics. It will enable one to 
test the generality of Jensen's (1970) cross- 
over effect: the low SES being superior to 
the high in associative ability in the subnormal IQ 
range. 

The tasks in the present study had the 
face validity of short-term memory: a vis- 
ual-digit matrix task, an auditory-input- 
visual-recognition task, and four-word mem- 
ory lists having acoustically similar, 
semantically similar, or unrelated words. 

Both IQ and SES levels were dichoto- 
mized in the study. The two IQ groups 
(normal and low) were matched on mental 
age, while within each IQ group, the two 
SES groups were matched on IQ. Such a 
design permitted us to examine the per- 
formance of subject groups due to IQ X 
SES interactions. 


METHOD 


Subjects 


Four groups of children, 30 in each group, were 
selected from an initially surveyed population of 
1,300 children. All children, both normal and re- 
tarded, had IQs of below 100 as found in their 
school records. Their mean IQ, SES, and mental 
age scores are shown in Table 1. The normal and 
retardate groups were matched on mental age for 
each SES level, whereas the high and low SES 
groups were matched on IQ. However, because of 
the superiority of the high SES to the low SES in 
IQ inherent in the population, a less than perfect 
matching was possible. Although IQs of the two 
SES groups were not statistically different 
(Groups 1 versus 2, and 3 versus 4), the mental 
ages were unfavorably loaded against the low 
SES groups. This, if anything, will stack the cards 
against finding support for the crossover effect. 


TABLE 1 
SUMMARY DATA FOR THE FOUR EXPERIMENTAL GROUPS 


                           Group 1          Group 2          Group 3           Group 4 
                           M       SD       M       SD       M        SD       M        SD 
IQ                         93.93   3.95     90.20   4.22     68.63    7.98     65.53    7.09 
Mental age                 93.73a  6.41     86.00   6.77     93.57b   13.37    85.93    14.50 
Chronological age          99.70   5.51     95.40   6.66     136.30   13.47    130.07   15.88 
Average socioeconomic 
  status (Blishen)         53.7             41.0             53.0              41.3 


Note.—Both mental age and chronological age are in months. 
a t between mental age Groups 1 and 2 = 4.32, p < .001. 
b t between mental age Groups 3 and 4 = 2.32, p < .05. 

In the present design, our object was to equate 
the SES groups on IQ and the IQ groups on mental 
age. Examination of Table 1 will reveal that this 
was achieved. Socioeconomic scores were obtained 
through interviewing parents and rating them on 
Blishen's (1961) occupational scale. The scale, 
constructed from Canadian census data, provided 
a ranking for occupations based on income and 
years of schooling. In our larger population, it 
showed a high correlation with the more elaborate 
Warner Scale (1960) scores (r = .80). 


Tasks 


Auditory short-term memory. This task was 
adapted from Baddeley (1964, 1966) who had com- 
pared the recall of acoustically and semantically 
similar words. He observed that acoustic simi- 
larity interferes with short-term memory whereas 
semantic similarity interferes with long-term 
memory. In the current study acoustic and se- 
mantic interference was examined in short-term 
memory to find out if they interact with IQ and 
SES. 

Twelve four-word lists were prepared from a 
pool of acoustically similar words. Also, 12 seman- 
tically similar four-word lists and 12 unrelated 
control lists were prepared. The words in each 
category were as follows. 
acoustic: mad, man, mat, cap, cat, can, cab, 
pan, tap 
semantic: big, long, great, tall, large, wide, high, 
fat, huge 
control: RUM bar, pen, few, hot, key, wall, 
00! 


Subjects were arbitrarily divided in each IQ-SES 
group so that half received the acoustic lists 
with interspersed control ones and half received 
the semantic lists and the control lists. All lists 
were read into magnetic tape and presented by 
tape recorder at the rate of one word per second 
with an interlist recall time of 15 seconds. Sub- 
ject’s oral recall of a list was recorded on tape for 
later scoring. A rest period of 1 minute was allowed 
after every twelfth list. 

Before the lists were presented, subjects were 
read the 27 words separately and were asked to 
define the words as they were being read in order 
to insure that each subject knew the words. 

Visual short-term memory. The visual task con- 
sisted of separate presentations of 20 five-digit 
grids as illustrated in Figure 1. It was a typical 
short-term memory task in which a grid was 
presented for the subject’s viewing (5 seconds), 
followed by a neutral filler task of color naming 
to prevent rehearsal (2 seconds), after which the 
subject recalled the digits on an empty 
grid. The stimulus materials were devised by E. 
Howarth and J. Brown of the University of Al- 
berta, and have been used in a personality battery. 
Presentation of the stimulus slides was controlled 
by a Kodak Carousel 850 projector and a series of 
interval timers. Time sequences for ready signal, 
stimulus, and the filler task are shown in Figure 1. 

Twenty-two stimulus grids were presented; the 
first two served as practice trials. 

Cross-modal coding. The task was a modified 
form of one used by Birch and Belmont (1964). 
After the subjects heard patterns of sound, they 
were asked to recognize visually which of the three 
dot patterns resembled the auditory stimuli. 
Birch and his associates found this test to dis- 
criminate between good and poor reading ability 
and between good and poor nutritional status of 
children, and did not rule out the possibility that 
the two variables were coexistent (Cravioto, 
Gaona, & Birch, 1967). 

In its modified form, the sound patterns were 
1,000-cycles-per-second pure tones of a .15-second 
duration. Patterns were created by presenting 
adjacent tones with either short or long pauses 
between them. A short pause separated the tones 
by .35 seconds, and a long pause by 1.35 seconds. 
The 10 test-tone series, as shown in Figure 2, was 
preceded by three practice trials. The test series 
was replicated three times. All the sound patterns, 
prerecorded on magnetic tape, were presented in- 
dividually to the subject, who had to recognize 
them visually out of three dot patterns presented 
on a white index card. The position of the correct 
pattern was randomly varied on these cards for 
each replication. The task can be viewed as a 
short-term memory test with auditory input and 
visual output modes. Obviously, however, the in- 
put would require appropriate encoding to trans- 
form the information for visual recognition. 

The sequence in which the tasks were given was 
varied in a predetermined random order, and sub- 
jects were assigned to one order arbitrarily. Test- 
ing was done at the subject's school in a quiet 
room. 

Scoring 

Auditory short-term memory was scored for 
free recall and serial recall. Although subjects 
were instructed to recall serially, a correct recall 
of the item in the list, irrespective of position, 
earned a point and their total was the free-recall 
score for the subject. 
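
The two scoring criteria can be sketched as follows. The free-recall rule is the one stated above; the serial-recall rule (a point only when a word is recalled in its original position) is an assumption, since the text does not spell it out. The four-word list is a hypothetical example assembled from the acoustic word pool, and the function names are illustrative.

```python
def free_recall_score(presented, recalled):
    """One point for each presented word that appears anywhere in the recall."""
    return len(set(presented) & set(recalled))


def serial_recall_score(presented, recalled):
    """One point for each word recalled in its original list position (assumed criterion)."""
    return sum(1 for p, r in zip(presented, recalled) if p == r)


# Hypothetical list and response: all four words recalled, two of them transposed.
presented = ["mad", "man", "mat", "cap"]
recalled = ["mad", "mat", "man", "cap"]

print(free_recall_score(presented, recalled))    # 4
print(serial_recall_score(presented, recalled))  # 2
```

A transposition costs points under the serial criterion but not under the free criterion, which is why the free-recall score is the less conservative of the two.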

In visual short-term memory, the number of 
digits in the correct position recalled by a subject 
in a grid was summed over the 20 grids and pro- 
vided the total score. Cross-modal coding was 
also scored in only one way: the total number of 
correct recognitions. Separate totals were main- 
tained for the three replications of the test series. 


Fig. 2. Cross-modal coding tone (stimulus) and dot (visual recognition) series. (The underlines were 
omitted from test cards when presented to the subjects.) 

Fig. 3. Interaction between IQ and SES: 
Serial recall. 


RESULTS 


Auditory Short-Term Memory: SES and 
IQ Differences 


Serial- and free-recall scores were analyzed 
separately. The predicted IQ X SES inter- 
action was found to be significant only for 
serial recall (Figure 3). A comprehensive 
analysis of variance of the serial-recall 
scores included IQ (2), SES (2), and stimu- 
lus interference (acoustic/semantic) as in- 
dependent measures, and control/experi- 
mental lists as a repeated measure. 

Significant Fs (df = 1/112, p < .01) 
were obtained for IQ, interference, and for 
IQ X SES. The first indicated that in spite 
of mental age matching, the normal group 
was superior to the retardates on overall 
serial recall. The second implied that acous- 
tic similarity was more confusing than 
semantic, as expected. Within-subjects Fs 
(df = 1/112) were significant for control 
versus experimental lists (p < .01) be- 
cause the experimental lists with interword 
similarity were more difficult to recall; IQ 
X Lists (p < .01) interaction indicated 
that the difficulty in recalling experimental 
lists was greater for the retardates than for 
the normals. Second-order and third-order 
interactions were also significant: IQ X 
Interference X Lists (p < .05) and IQ X 
SES X Interference X Lists (p < .01). 
The first one was easily interpreted by 
examining the means that showed the re- 
tardates were more affected by the acous- 
tically similar lists than their normal 
counterparts, but such a difference was not 
noticed in the recall of semantically similar 
lists in the control condition. The third- 
order interaction can be traced to the poor 
recall of semantically similar lists by the 
high SES retarded subjects. Although this 
group’s overall short-term memory was not 
poor, the subjects seemed to be processing 
information quite differently from the sub- 
jects in the other three groups. Analysis of 
free-recall scores was not as informative, 
essentially because of the less conservative 
criterion for scoring. As before, the normal 
IQ group had higher recall scores than the 
retardates, and the retardate’s recall of the 
experimental lists was significantly de- 
pressed when compared to the normals. 
The IQ X SES F was nonsignificant and had 
a very small value (.99). 


Visual Short-Term Memory 

The 2 (IQ) X 2 (SES) analysis of vari- 
ance clearly showed the superiority of the 
low SES retardates over the high SES re- 
tardates. IQ X SES interaction was sig- 
nificant (F = 6.14, df = 1/116, p < .02), 
and the normals did better than the re- 
tardates (F = 26.42, p < .01). As in the 
auditory short-term memory tasks, this 
was observed even though the IQ groups 
were matched for mental age. The SES 
main effect was not significant (F < 1). 
Means and crossover effects can be seen in 
Figure 4. 


Fig. 4. Interaction between IQ and SES: 
Short-term visual memory. (Ordinate: number of digits recalled; 
abscissa: low IQ and average IQ; separate lines for low SES and high SES.) 


Cross-Modal Coding 


Although the mean number of correct 
recognitions showed the same IQ X SES 
interaction trend as in the two previous 
tests, it was far from approaching statistical 
significance. A 2 (IQ) X 2 (SES) analysis 
of variance revealed a strong main effect 
for IQ (F = 52.70, df = 1/116, p < .01). 
Thus, the test was similar to the two previ- 
ous ones in detecting cognitive competence 
of the IQ groups, but stood apart from 
them as a measure of rote learning ability. 


Test Reliability 


All short-term memory auditory scores 
were found to have high reliability (odd- 
even). The highest reliability of .90 was ob- 
tained for serial recall of semantic lists, .82 
for free recall of acoustic list, and last, .79 
for serial recall of acoustic list. The figures 
were based on all subjects. 

Visual short-term memory had a split- 
half reliability of .80. The reliability of 
cross-modal coding was .72, which was ob- 
tained by averaging the intercorrelations of 
the three replications of the test series. 
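
An odd-even reliability of the kind reported here can be computed by correlating each subject's total on odd-numbered trials with the total on even-numbered trials. The sketch below assumes that procedure; the data are hypothetical, and the text does not say whether a Spearman-Brown correction was applied, so none is included.

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def odd_even_reliability(scores):
    """scores holds one list of per-trial scores for each subject."""
    odd = [sum(s[0::2]) for s in scores]   # trials 1, 3, 5, ...
    even = [sum(s[1::2]) for s in scores]  # trials 2, 4, 6, ...
    return pearson_r(odd, even)


# Hypothetical per-trial scores for four subjects on four trials.
scores = [[5, 4, 5, 5], [2, 3, 2, 2], [4, 4, 3, 4], [1, 2, 1, 1]]
print(round(odd_even_reliability(scores), 2))
```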

Thus, on the whole, the test scores show 
reasonably high reliability on samples of 
children with an IQ below 100. 


Discussion 


The graphs and the analyses for auditory 
serial recall and visual short-term memory 
clearly show that the low SES retardates 
had significantly better scores than high 
SES retardates. The other test scores con- 
firm this trend, but the effect does not ap- 
proach an acceptable level of statistical 
significance. Jensen’s theory concerning the
IQ X SES interaction is thus supported by
our results.

Unlike Jensen’s interracial samples that 
were substantially separated in IQ and 
SES, all our samples came from the Cau- 
casian subpopulation. Also, the low SES 
children did not come from extremely poor
slum areas. They were in the same neighborhood
school as the high SES children.
In IQs, both groups were below 100 and
the ranges were quite close, but did not
overlap. Similarly, the high and low SES
had, in fact, adjacent occupational scale
positions. In spite of the closeness of the
samples in IQ and SES, the crossover effect
was obtained.

The low SES subjects were slightly lower 
in both IQ and mental age than the high 
SES subjects, both in the normal as well as 
in the retardate groups. Because of this 
slight advantage of the high SES normals, 
their performance in all tasks is expected 
to be somewhat better than the low SES 
normals, which is generally found in the 
graphs. But the “unexpected” finding is 
that in spite of this advantage, the high 
SES in the retardate group performs at an 
inferior level in some of the tasks. The results
of the present study provide a conservative
test for Jensen's hypothesis, and
extend its generality beyond the tests he
had used.

There is more than one reason why some 
test results did not show the crossover 
effect. A general explanation in terms of 
regression to the mean may be advanced. 
Mean scores of low SES retardates will not 
show as much upward regression as those of 
high SES because they are closer to their 
SES group mean. Thus, any regression 
effect will decrease the difference between 
the two SES groups and work against the 
chances of finding a significant crossover 
effect.

The other reasons may be found in the
nature of information processing required
by the task and the habitual manner of
processing such information peculiar to each
group. Cross-modal coding is a task which
illustrates both. We obtained an IQ difference,
but not an IQ X SES interaction in
this task. The processes in cross-modal coding
not only include translation from one
mode to the other, but storage and integration
of the input information. Perhaps it
does not require an iconic transformation
only as most other short-term memory
tasks do. From the subject's point of view,
it may be approached by some as any other
reasoning task, requiring symbolic transformations,
whereas others may use a combination
of memory and reasoning. It is
not within the scope of this paper to discuss
these in detail, but a somewhat elaborate
discussion is presented in another report
(Das & Chambers, 1970).

Regarding the classification of our two
retardate groups as instances of primary 
and secondary mental defectives, Jensen 
(1970) has suggested that those deficient 
in both associative and reasoning abilities 
(Levels I and II) may be called primary 
retardates, whereas those deficient only in 
Level II be called secondary retardates, or 
better, nonretarded individuals who can be 
trained to exploit their Level I ability to 
cope with ordinary situations in daily living
outside the school. This classification in
terms of cognitive functioning is more use- 
ful for behavioral science than the conven- 
tional organic versus cultural-familial cate- 
gories. Our low SES retardates were 
certainly superior to high SES in Level I 
tasks. It would have been informative if
thorough neurological examinations had
been done on both groups to establish the
number of so-called organic children in each
group. In terms of cognitive functioning,
the high SES retardate was deficient in 
Level I, and seemed to conform to a pri- 
mary category, whereas the low SES retard- 
ate appeared to be “culturally deprived.” 
But where are the primary retardates of 
low SES? Since the children in our samples 
were in the public school system, it may be 
surmised that the primary retardates 
among low SES children usually cannot 
make it to Grade 1 of public schools, and 
are to be found in institutions or in private
homes.


A questionnaire covering student attitudes, beliefs and behaviors
in the areas of scholarship, prestige, and popularity, peer influences
on behavior, and personal goals was administered to 1,225 New Zealand
secondary school students in two single-sex schools and one
coeducational school. Schools were similar in curricula, student
regimentation, and attitudes and values of teachers, administrators, and
students’ parents. Significant differences were found between students
in the single-sex schools and students of the same sex in the
coeducational school in all of the above areas. Results suggest coeducation
may be inimical to both academic achievement and social adjustment.


Many psychologists have viewed the
early and preadolescent period as a definitive
point in intellectual development.
During this stage, intelligence, as a truly
coordinated mental organization involving
sensory-motor, cognitive, and conceptual
abilities, can be said to appear (Piaget,
1952; Vinacke, 1951). Ausubel (1962) has
noted the transition during early adolescence
from “a predominantly concrete to a
predominantly abstract mode of understanding
and manipulating complex relational
propositions [p. 268].” And Braham
(1965) has stated that “the prior years may
be considered as the period of intellectual
birth, with adolescence as the period of true
intellectual growth [p. 251].”

However, as Braham has also noted,


The continued development of intelligence
through adolescence requires an intellectually
stimulating environment. It also requires interest
in, and motivation for, intellectual activities on
the part of the adolescent. . . . The problem, however,
is that when intelligence is just beginning to
function as a coordinated structure, and requires
the most nurturing circumstances, the adolescent
meets a major deterrent to its development, that
of the intellectually negating adolescent peer-group
structure, or sub-culture [p. 252].


This opinion, that the “adolescent society” 
in our secondary schools exerts a stultifying 
effect on intellectual activities, has been 
supported by the results of Coleman's 
(1961a) study of the adolescent society in 10 
American high schools.

Coleman, pointing out that status in this 
adolescent society is dependent upon popu- 
larity, rather than upon scholastic or intel- 
lectual achievement, has suggested that 
these adolescent values stem, at least in 
part, from the coeducational organization 
of our schools, with a consequent emphasis 
upon “rating and dating.” He observes that,
although educators and laymen have commonly
assumed that it is “better” for boys
and girls to be in school together during
adolescence, coeducation “may be inimical
to both academic achievement and social
adjustment [p. 51].”
It may be that one of our fondest
beliefs is in need of reexamination. But an
assessment of the effects of coeducation on
adolescent values is complicated by two
factors: First, there is the difficulty of
separating the effects of coeducation from
the effects of other aspects of our society. 
Second, there is the lack of single-sex schools, 
comparable to our coeducational high schools, 
whose students could be used in making 
comparisons. The New Zealand public 
secondary schools, however, provide an 
opportunity for evaluating some of the 
views on the effects of coeducation upon 
adolescent values. Although the high schools 
have been traditionally single-sex, in recent 
years a number of coeducational high schools 
have been established. Some of these have 
been patterned on the American compre- 
hensive high school and are regarded in New 
Zealand by the general public as being 
innovative, if not radical departures from 
educational tradition. Other coeducational 
schools, however, although admitting both 
sexes, follow the same curricular plan as the 
single-sex schools and the students, their 
parents, and the instructional and adminis- 
trative personnel appear to hold essentially 
the same values and attitudes. In its broader 
outlines, the New Zealand educational 
system is quite similar to that of the United 
States, as are societal attitudes toward the 
role and importance of education. For these 
reasons, this study made use of New Zealand 
secondary school students to assess the 
effects of coeducation on student attitudes 
and behaviors related to academic moti- 
vation and achievement. 


METHOD


Subjects 


Subjects were 1,225 students in their third and
fourth years of secondary school in Wellington,
New Zealand, the capital city with a population
of approximately 200,000. Of these students, 697
were males, 455 enrolled in an all-boys’ school and
242 in a coeducational school. The total of 528
females was divided between 364 in an all-girls’
school and 164 in the same coeducational secondary
school.

The three schools were similar with respect to 
curriculum organization, degree of student regimentation,
the requirement that all students wear
school uniforms, and instructional methods. In
Wellington, a small proportion of students elect
to attend schools outside their assigned district.
However, the large majority attend a particular
school because it is in their zone. The girls’ school
in this study, for example, draws less than 10% of
its students from outside its zone.
This pattern of attending the school in one’s
district is probably accentuated by the fact that 
instructional programs, whether in single-sex or 
coeducational schools, are virtually indistinguish- 
able from one another. All programs are tra- 
ditional university preparatory curricula and all 
are based on syllabi determined by the demands 
placed upon the schools by the national public 
examinations for graduation and university en- 
trance. 

While it was not possible to assign students 
randomly to single-sex or coeducational schools, 
the above factors, taken in conjunction with the 
generally egalitarian nature of New Zealand so- 
ciety, suggest that the three groups probably did 
not differ markedly from one another in
background or motivation upon entry into sec- 
ondary school. 


Procedure 


Items from the questionnaire used by Coleman 
(1961a) were selected to assess student attitudes 
and beliefs in the following areas: scholastic activities
and attitudes, prestige and popularity, peer influences
on behavior, and self-regard. These items
were combined into a single questionnaire and 
administered on a group basis in each school with 
the explanation to student subjects that a study 
was being made to compare the attitudes of New 
Zealand and American secondary school students. 
All questionnaires were answered anonymously. 


RESULTS


The responses of boys in the all-boys' 
school were compared with boys in the 
co-ed school; co-ed girls’ responses were 
compared with those of girls in the all-girls' 
school. Chi-square analysis of responses 
indicated significant differences between 
responses of students in the co-ed school and 
those in the single-sex schools in each of the
following categories.


Scholastic Activities and Attitudes


When asked to indicate the number of
hours spent on homework, significant differences
were found with co-ed students,
both boys and girls, reporting that they 
spent less time than students in the single- 
sex schools. 

Thirty-eight percent of the co-ed boys
reported spending, on the average, from
1½ to more than 3 hours per day on homework.
Over 55% of the boys attending the
all-boys’ school reported averaging this
much time. While differences were not as
great between the two groups of girls, they
were significant. Fifty-five percent of the
girls at the all-girls’ school spent 1½ to 3 or
more hours per day on homework, while
44.6% of the co-ed girls reported spending
this much time.


TABLE 1
Item 7: How much time, on the average, do you spend doing homework outside school?

                          Co-ed boys     Boys           Co-ed girls    Girls
Time spent                  N      %       N      %       N      %       N      %
None or almost none        10    4.1      15    3.3      14    8.5      11    3.0
Less than ½ hour a day     11    4.6      10    2.2       9    5.5      19    5.2
About ½ hour per day       31   12.9      16    3.5      15    9.1      21    5.8
About 1 hour per day       53   22.0      76   16.6      22   13.3      58   15.9
About 1½ hours per day     43   17.8      86   19.0      33   20.0      55   15.1
About 2 hours              55   22.8     167   36.9      49   29.7     145   39.8
3 or more hours            38   15.8      84   18.5      23   13.9      55   15.1

χ² = 36.48, p < .01.                     χ² = 14.43, p < .05.


When asked how they would spend a free
hour in school if given a free choice, co-ed
boys differed significantly (p < .05) from
boys attending the all-boys’ school. More
co-ed boys indicated they would spend the
time on sports and fewer would spend it in
study. The differences between co-ed girls
and girls attending the all-girls’ school were
even greater (p < .01) and followed a somewhat
different pattern. Nearly 32% of those
attending the girls’ school would spend the
time studying as opposed to 12.2% of the
co-ed girls who would spend the time this
way. More of the co-ed girls would spend
the additional time in a club or activity
(18.9% versus 13.6%) and over twice as
many (24.4% versus 9.5%) would spend the
time at something other than the choices
offered.


TABLE 2
Item 8: Suppose you had an extra hour of school, how would you use it?

                          Co-ed boys     Boys
Time spent                  N      %       N      %
Course                     37   15.2      65   14.4
Sport                      94   38.7     136   30.2
Club activity              30   12.3      61   13.5
Study                      39   16.0     121   26.8
Something else             43   17.7      68   15.1

χ² = 30.81, p < .01.


There were no significant differences
between the two groups of boys in the
percentages who reported they had been
truant during the past year. There were,
however, significant differences (p < .0)
between the co-ed girls and girls attending
the all-girls’ school. The same percentage of
co-ed girls as co-ed boys (63.0%) reported
that they had been truant, while less than
40% of the girls attending the all-girls’
school indicated they had been absent
without authorization during the past year.
In their overall appraisal of their years in
high school, boys at the all-boys’ school
gave more favorable ratings (p < .05).
Differences were even greater between co-ed
girls and girls attending the girls’ school
(p < .01). Again, a greater percentage of
girls’ school students said their school years
had been interesting and hard work, and
fewer of them rated their school years as dull
or unhappy. However, unlike the co-ed
boys, significantly more co-ed girls described
their time in school as “full of fun and
excitement” (p < .01).


TABLE 3
Item 9. In the last year have you been truant
from school?


TABLE 4
Item 11. The years at high school have been:

                            Co-ed boys     Boys           Co-ed girls    Girls
Response                      N      %       N      %       N      %       N      %
Fun and exciting             24    9.9      50   11.1      29   17.7      25    6.8
Interesting and hard work    73   30.2     164   36.4      46   28.0     172   47.0
Fairly pleasant             104   43.0     179   39.8      68   41.5     141   38.5
Fairly dull                  39   16.1      48   10.7      14    8.5      16    4.4
Unhappy                       2    0.8       9    2.0       7    4.3      12    3.3

χ² = 12.48, p < .05.                       χ² = 42.06, p < .01.
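
The chi-square comparisons reported in this section can be illustrated with the boys' counts for Item 8 (Table 2). The sketch below is a plain Pearson chi-square for a table of counts, written for illustration only; it is not taken from the original analysis, and the function name and layout are assumptions. For these counts the statistic falls between the standard .05 and .01 critical values for 4 degrees of freedom, which agrees with the p < .05 stated in the text for the two groups of boys.

```python
def chi_square(table):
    """Pearson chi-square for a rows x columns table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Table 2 counts: (co-ed boys, boys at the all-boys' school).
item8_boys = [
    [37, 65],    # Course
    [94, 136],   # Sport
    [30, 61],    # Club activity
    [39, 121],   # Study
    [43, 68],    # Something else
]

# Standard chi-square critical values for df = 4.
CRIT_05, CRIT_01 = 9.488, 13.277

stat, df = chi_square(item8_boys)
print(f"chi-square = {stat:.2f}, df = {df}")
print("significant at .05" if stat > CRIT_05 else "ns")
```

The same function applies to any of the response-by-group tables in this section whose counts survive, with one row per response category and one column per school group.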


Prestige and Popularity 


Students at both co-ed and single-sex
schools rated being a leader in activities and
being in the leading crowd as most important
in achieving prestige and coming from the
right family and having a nice car as being
relatively unimportant. However, co-ed
students, both boys and girls, ranked membership
in the leading crowd significantly
higher and scholarship significantly lower
than did boys and girls at single-sex schools.
Boys attending the all-boys’ school ranked
sports as a means of achieving prestige
significantly higher than did co-ed boys.

When asked how they would best like to
be remembered at their school, as brilliant
students, leaders in activities, or as the
most popular, co-ed boys distributed their
choices rather uniformly across the three
categories while a majority of the boys at
the all-boys’ school chose being remembered
as a brilliant student or a leader in activities.
There were, however, no significant differences
between the two groups of boys.
The girls, on the other hand, did differ
significantly (p < .01), with 41% of the girls
at the all-girls’ school preferring to be remembered
as a brilliant student as compared
with 25.9% of the co-ed girls making this
choice, and 34.2% of the co-ed girls wished
to be remembered as “most popular” while
19.4% of the students at the all-girls’ school
wished to be remembered for this reason.


TABLE 5
Item 1. What does it take to get to be important
and looked up to by the other students at school?

Response
Coming from right family
Leader in activities
Having a nice car
A good scholar              3.09   2.60**
Being a sports star
Being in leading crowd      2.27   2.95*    2.54   3.15**

* p < .05.
** p < .01.


TABLE 6
Item 6. If you could be remembered here at school for one of the three things listed below,
which one would you want it to be?

Brilliant student
Leader in activities
Most popular

χ² = 16.83, p < .01.


J. C. JONES, J. SHALLCRASS, AND C. C. DENNIS


Peer Influences 


A majority of the total group agreed that
membership in the leading crowd sometimes
required that one compromise his principles,
and there were no significant differences
between students, boys or girls, attending
the single-sex schools and those attending
co-ed schools.


TABLE 7
Item 10. If you want to be part of the leading
crowd around here, you have to sometimes
go against your principles.

              Co-ed boys     Boys           Co-ed girls    Girls
Response        N      %       N      %       N      %       N      %
Agree         151   64.0     296   66.0      79   49.0     181   51.0
Disagree       85   36.0     152   34.0      82   51.0     175   49.0

χ² = .14, ns.


The disapproval of a friend had more
impact than the disapproval of parents or
teachers for all groups. But, while there was
no difference between co-ed boys and boys
attending the all-boys’ school, significantly
more (p < .05) co-ed girls rated the loss of
friendship as more difficult to take than
parental disapproval.

Although “having a good time” ranked 
high among the things that were important
to all four groups, it was ranked significantly 
higher by co-ed girls than by girls at the 
all-girls’ school (p < .01). Groups and 
activities outside the school were more 
important to boys and girls attending the 
single-sex schools than they were to co-ed 
boys and girls. Maintaining a good repu- 
tation was valued more highly by co-ed 
boys than by boys at the all-boys’ school. 

When asked to indicate the relative
importance of the things they strove for in
school, more students ranked “learning as
much as possible” in first place than any of
the alternatives. However, significantly
more (p < .05) girls from the all-girls’
school ranked this as a first choice than
co-ed girls. They also ranked “pleasing my
parents” higher (p < .05) and “being accepted
and liked by other students” lower
(p < .01). The two groups of boys differed
only on this last category, co-ed boys ranking
acceptance by other students higher (p <
.05) than did boys from the all-boys’ school.


TABLE 8
Item 12. Which of the following would be hardest for you to take?

                          Co-ed boys     Boys           Co-ed girls    Girls
Response                    N      %       N      %       N      %       N      %
Parent’s disapproval       93    [?]     [?]   43.5      55   35.3     [?]    [?]
Teachers’ disapproval       6    [?]       5    1.6       7    4.5       5    [?]
Breaking with a friend    138   58.5     240   55.0      94   60.3     195    [?]

χ² = 2.85, ns.                           χ² = 6.53, p < .05.


TABLE 9
Item 4. Rank the following in terms of their
importance to you.

Outside groups and activities        2.62   2.53*    2.76   2.50**
Activities associated with school    3.35   3.14**   3.32   3.27
                                     1.68   1.92
                                     2.18   2.40*    2.08   1.91*
Having a good time                   1.90   2.30**
A good reputation

* p < .05.
** p < .01.


Self-Regard 


Adolescence has usually been considered 
as a period when young people are frequently 
less than satisfied with themselves and the 
results seem to bear this out. Less than half 
of the boys and even fewer of the girls 
indicated that there was very little about 
themselves they would like to change. And in 
this respect, there were no significant dif- 
ferences between students attending co-ed 
or single-sex schools. 

A less direct question asked students to
choose the category they felt best described
a person who was alone. While no differences
were found between the two groups of boys,
62% of the co-ed girls indicated that they
thought such a person was bored, unhappy,
lonely, or afraid. A greater percentage of
girls attending the all-girls’ school described
such a person as better off, relaxed, thinking,
reading, or happy.


TABLE 10
Item 3. Among the things you strive for, just how
important is each?

                                            Co-ed boys   Boys     Co-ed girls   Girls
Pleasing my parents                            2.35      2.25        2.40       2.19*
Learning as much as possible                   1.90      1.82        2.06       1.85*
Living up to my religious ideals               3.07      3.00        3.54       3.02
Being accepted and liked by the students       2.06      2.25*       1.97       2.33**

* p < .05.
** p < .01.


TABLE 11
Item 2. Which category comes closest to your feeling about yourself?

Response
Don’t like the way I am; would change completely
Many things I'd like to change; but not completely
Like to stay much the same; little I would change

χ² = 1.19, ns.


TABLE 12
Item 5. A person who is alone is:

                                 Co-ed boys     Boys           Co-ed girls    Girls
Response                           N      %       N      %       N      %       N      %
Bored or unhappy                  50   22.0     100   24.0      15   10.0      44   12.9
Lonely                            79   35.0     135   32.0      68   45.3     123   36.2
Afraid                            17    7.5      35    8.3      10    6.7      20    5.9
Better off                        21    9.2      34    8.0      10    6.7       8    2.4
Relaxed, thinking, or reading     55   24.0     100   24.0      43   28.7     122   35.9
Happy                              6    2.6      20    4.7       4    2.7      23    6.8

χ² = 3.04, ns.                                  χ² = 13.15, p < .05.


Discussion


Various explanations for these results are
possible. Students electing to attend coeducational
schools, as opposed to single-sex
schools, could conceivably differ in ways
that are reflected in academic achievement
and attitudes. This possibility, however, for
reasons mentioned in the discussion of the
selection of subjects, seems rather remote.
It is also possible that significant differences
exist between teachers in single-sex schools
and those in coeducational schools and that,
even with the restrictions imposed by a 
remarkably uniform curriculum and visiting 
inspectors from the national office, they are 
able to make these differences felt. Since 
there is no central assignment of secondary 
teachers, this possibility cannot be ruled out. 
But the uniformity of curricula, salaries, 
facilities, and administrative philosophy 
would suggest that this is probably an un- 
likely source of differences. A simpler, more 
direct, explanation is that the adolescent 
social structure promoted by a coeducational 
system exerts a strong influence on ado- 
lescent values and on the attitudes these 
adolescents hold toward scholastic ac- 
tivities, themselves, and their peers and 
parents. 

That the adolescent society strongly 
influences its members, whether they attend 
coeducational or single-sex schools, seems 
clear enough. For a majority of the adoles- 
cents in this study, the disapproval of their 
friends is more serious than parental dis- 
approval, and a majority also indicated that 
maintaining status sometimes required that 
they go against their principles. 

There do, however, seem to be critical 
differences in the degree of pressure and in 
the end results of these social pressures for 
students in the two types of schools. Coleman 
(1961a) has suggested that, for the American 
adolescent attending secondary school, the 
school, especially for the small-town student, 
is the focal point of the adolescent society. 
He achieves status and prestige here or he 
fails to achieve it at all. The data from this 


study suggest that this is more likely to be 
true for those attending coeducational 
schools, even when these are urban schools. 
Out-of-school groups and activities were 
less important to them than to students 
attending single-sex schools. The pervasive- 
ness of the adolescent society for students in 
coeducational schools may account for the 
fact that, despite the social pressures shaping 
their behavior, they seemed relatively 
unaware of these effects. Girls attending the 
all-girls’ school, for example, were less
frequently truant than were boys in either 
type of school. Co-ed girls were not only 
more frequently truant but were truant at 
the same rate as the boys in their schools. 
Yet they indicated no more frequently than 
the girls in the single-sex school that social 
pressures sometimes required violating per- 
sonal principles. 

It also appears that the rewards and 
sanctions of the adolescent social system 
weigh more heavily on the girls than on the 
boys. Truancy rates for boys in the co-ed 
school do not come down, they remain at the 
same level as for boys in the single-sex 
school; rates for the girls go up in the coeducational
school. And, in general, differences
on most items were larger between the
two groups of girls than between the two
groups of boys. When asked how they would
best like to be remembered at their schools,
more boys in both groups rated being remembered
as a brilliant student above being
remembered for their popularity or their
leadership in activities. This was also true of
girls at the all-girls’ school. However, co-ed
girls placed both popularity and leadership
in school activities above being remembered
as a brilliant student. And this orientation
toward popularity and status in the social 
system apparently leads co-ed girls to de- 
scribe a person who is alone as being bored, 
unhappy, lonely, or afraid. Girls attending 
the all-girls' school most frequently described 
such a person as relaxed, thinking, reading, 
or happy. 

In what Coleman (1961b) has referred to 
as "the competition for adolescent energies,” 
scholastic pursuits seem not to fare as well 
in the coeducational school. Coeducational 
students not only were more frequently 
concerned with matters of popularity and 
prestige and with nonacademic activities,
but were more frequently truant, reported 
spending significantly less time on home- 
work, and would, if given a free hour, be 
less inclined to spend it in studying. In 
what could be considered as a deviation 
from Coleman’s study, the boys at the 
all-boys’ school showed a greater interest 
than did the co-ed boys in athletics. This is 
perhaps the result of two factors: There may 
be fewer ways of achieving status in the 
all-boys’ school; and New Zealand tends to 
be a male-oriented society with a very 
strong interest in sports. The all-male 
environment of the boys’ school perhaps 
reflects the larger male society more than 
does the society of the coeducational school. 

In any case, those co-ed students, particularly
the girls, who thought of their high
school years favorably, tended to describe
them as “full of fun and excitement,” in
contrast to the students at single-sex schools
who described them as “interesting and hard
work.” It may be that, for many coeducational
students, the academic side of school
is seen as interfering with the school’s ability
to provide fun and excitement.

But are there advantages in the coeducational
system that outweigh some of these
disadvantages? Do students attending a
coeducational school like school better?
Does coeducation result in a happier, better
adjusted adolescent? Unfortunately, the
evidence indicates otherwise. Fewer found
their high school years interesting and more 
rated them as dull or unhappy while a 
minority of co-ed girls thought of them as 
happy and exciting. A reasonably good 
indication of the extent to which a person is 
happy and self-accepting is the extent to 
which he wishes to change himself. A ma- 
jority of the adolescents of both sexes in this 
study would change some things about 
themselves and in this respect those at- 
tending the coeducational school were no 
different from those attending the single-sex 
schools. 

While there are obviously hazards in 
generalizing from one society to another, 
the data obtained in this study are con- 
sistent with those obtained by Coleman in 
his study of adolescents in American high 
schools, and the differences between the 
students in coeducational and single-sex 
schools are consistent with his assumptions 
about the effects of coeducation on American 
adolescents. It would seem, then, that one 
of the more cherished American educational 
beliefs, our belief in the value of coeduca- 
tion, is in need of reexamination. We should, 
perhaps, either change the nature of our 
educational organization or attempt to 
alter the social structure and the goals for 
which students are working in our coeducational
schools.


The purpose of this study was to determine whether preschool boys
could resist sex-inappropriate behavior advocated by an esteemed 
woman teacher. Each subject first chose a toy to keep and stated 
the toy preference for the opposite sex. The teacher then advocated 
a sex-inappropriate toy choice. The child was free to resist, and social 
and nonsocial opportunities for supporting resistance were available 
to him. The hypotheses were confirmed that most boys would resist 
sex-inappropriate behavior, boys would more often exhibit resistance 
techniques than girls, and both sexes would choose sex-appropriate 
toys for boys more often than for girls. 


One criticism of the preschool program is 
that it exerts a strong feminizing influence 
on boys. Attractive resources are controlled 
by the teacher and are dispensed for con- 
formity to her demands. Since many of 
these demands involve feminine-preferred 
behavior (Fagot & Patterson, 1969), boys 
are likely to experience direct or vicarious 
reward for sex-inappropriate behavior. Two 
factors further heighten the feminizing effect 
of the teacher and, according to Bandura’s 
(1969) modeling theory, provide a combina- 
tion of model and subject characteristics that 
should facilitate the acquisition of sex-in- 
appropriate behavior: The boys’ cognitions 
for obeying the teacher are in conflict with 
their cognitions about exhibiting sex-appro- 
priate behavior and boys typically are less 
competent than girls in many preschool 
activities (McCandless, 1967). 

However, a powerful deterrent to the 
acquisition of any social learning is the 
effect of prior experience. According to 
Lynn’s theory of cultural identification 
(1959), boys are exposed from an early age to 
considerable direct and vicarious reward for 
sex-appropriate behavior and for being boys, 
and are punished for sex-inappropriate be- 
havior, so that by the late preschool years, 
sex-role preference is well established. Fur- 
thermore, preschool children of both sexes 


typically are aware that sex-appropriate 
behavior is expected of boys. 

If Lynn’s premises are accepted, it would 
follow that preschool boys should have 
acquired resistance to sex-inappropriate be- 
havior along with a repertoire of responses 
to support this resistance and that consequently,
the feminizing effect of the teacher
would not be as powerful as critics of the 
preschool suppose. 

The purpose of this study was to determine
whether preschool boys in the naturalistic
setting of a preschool were able to
resist the advocation of sex-inappropriate
behavior by a woman teacher highly
esteemed by all the children. The experiment
was also designed to identify the techniques
used by boys for supporting resistance to
sex-inappropriate behavior and to make
between-sex comparisons of the frequency of
resistance responses.

In the experimental procedure, each child
chose a toy to keep and stated the toy preference
for a child of the opposite sex. The
teacher then advocated a sex-inappropriate
toy choice. The child was free to resist the
teacher's choice and the experimental situation
provided him with opportunities for
supporting his decision. It was predicted that
(a) most boys would resist the advocation of
sex-inappropriate toy choices; (b) boys would




more often exhibit techniques of resistance 
than girls; and (c) both boys and girls would 
select sex-appropriate toys for boys more 
often than for girls. 


METHOD 


Subjects 


Sixty children (30 boys and 30 girls) enrolled in 
the Stanford University nursery school served as 
subjects. The school population was a homogene- 
ous one: the subjects were white middle-class 
children of average and above average intelligence, 
whose fathers’ occupations were professional, 
business managerial, or graduate study. Twenty 
boys were selected randomly and assigned to the 
experimental boys’ group, the remaining 10 boys 
formed the control boys’ group. The same proce- 
dure was used to assign the 30 girls to groups. The 
subjects ranged in age from 42 to 63 months with a 
mean of 55.40 months and a standard deviation 
of 5.27 months. The four experimental and control 
groups did not differ in chronological age.


Preexperimental Toy Selection


Prior to the experiment, children from the 
nursery school selected the 12 toys that were to be 
in the experiment. Ten boys and 10 girls, serving 
only in this preexperimental toy selection, were 
each asked to select from a group of small inexpensive
toys for boys, girls, and both sexes, the
four toys in order of preference that (a) a child of
his age and sex would choose, (b) a child of his age
but of the opposite sex would choose, and (c) a
boy and a girl of his age would choose if they were
to play with the toy together. The 12 toys having
the highest percentage of agreement were, for boys,
airplane (100%), train (95%), boat (95%), and
gun (90%); for girls, necklace (100%), cradle
and doll (100%), doll in bath (90%), and tea set (90%);
and for both boys and girls, kaleidoscope (100%), flute
([illegible]), bubble pipe (90%), and mouth organ
([illegible]).


Experimental Procedure 


From these 12 toys, the experimental and control
subjects each selected in order of preference
four toys for themselves and four toys for the
opposite sex. Each of the 60 subjects was brought
individually by the experimenter to the experimental
room, a room in which the children frequently
played as part of the routine activity in
the school. The experimenter was a female graduate
student well-known and liked by the children.
The experimenter said,


I'm trying to find out which toys children like
best, so I’ve put some toys on the table and
I'd like you to tell me which ones you like
best.


The subject and experimenter then looked at
each of the toys, a leisurely procedure that involved
handling, trying out, and talking about
each toy, with the experimenter using a set of
standard comments about each toy. She then took
the subject away from the table, sat down, and
said,


Now, you go and look at all the toys and bring 
me the one you like best. Later, when I can 
get enough of these toys, the teacher says I 
can give each of you one. Bring me the toy you 
like best. You can take as long as you like to 
choose. 


As soon as the subject had picked a toy, the ex- 
perimenter said, 


Now bring me the one you like next best. 


This procedure was repeated until the subject 
had selected four toys. The toys were returned to 
the table and the same procedure was used to let 
the subject select four toys for the opposite sex. 
The experimenter recorded each subject’s toy 
preferences, primarily for observer reliability 
purposes, but also as a sign of interest in the 
subject’s choice. An experienced graduate student 
also recorded each subject’s toy preferences from a 
concealed observation room. After each subject's 
turn, the toys were arranged in random order to 
avoid position effects. 

One week later, the same procedure was repeated
with all 60 subjects to establish the reliability
of the subjects’ toy preferences. The
experimenter justified this second session by
saying that she was going to give each child a toy
next time she came and she wanted to be sure
which toys to bring. All the subjects participated
with enthusiasm.

Two weeks later, each of the 60 subjects picked 
one toy to keep. In the experimental groups, each 
of the 40 subjects was brought into the experi- 
mental room and the experimenter said, 


Today you can pick any toy you want to keep. 
See, here are all the toys and here (pointing to
a second table) are lots and lots of each kind
of toy. I'll sit here and you bring me the one
you want to keep. You can take as long as you
like to choose.


As soon as the subject had selected a toy, the
experimenter wrote the subject’s name on a bag,
saying that the subject could have his toy
when it was time to go home. As the experimenter
put the toy in the bag, she paused, looked at the
subject and said,


You know what would be fun? The
teacher is out in the kitchen. Let's ask her to
tell us which toy she thinks you should take.
It'll be more fun if she doesn't know which one
you picked so let’s keep it a secret. Now, I'll
put your toy back on the table and I'll see if
the teacher will come in.


The experimenter went to the door, called to
the teacher, and as she entered handed her a small 
card that indicated which toy to pick. In all cases, 
the teacher picked the toy which had been the 
subject’s number one toy preference for the op- 
posite sex. The teacher selected the toy with 
enthusiasm, made four standardized comments 
about its suitability for the subject and then left 
the room. The experimenter then asked the sub- 
ject to bring her the toy he wanted to keep and 
this toy was put into his bag. 

Next, the experimenter showed the subject 12 
story books, each having a cover picture of one of 
the 12 toys. She identified the toy on each cover, 
and said, 


I want you to look at these books and bring 
me the one you would like to have read to you. 


As soon as the subject had selected a book, the
experimenter said,


Another boy (girl) is coming in to choose a
toy. You've been looking at these toys so
you'd be a good one to help him choose. You
tell him which toy you think he should take.


The experimenter brought another same-sex
same-age child in and the subject told him which
toy to take. A previous experiment (S. Ross, 1971)
showed that children could be trained without
difficulty for this type of role. The subject then
left the experimental room.

In the control groups, each of the 20 subjects 
followed the same procedure with one important 
exception: the important adult advocating a dif- 
ferent choice was not introduced into the session. 


RESULTS


The hypothesis was confirmed that most
boys (n = 15) would resist the advocation
of a sex-inappropriate toy choice (χ² = 5.0,
df = 1, p < .05). Fourteen boys maintained
their original sex-appropriate toy choice, 1
shifted to a different sex-appropriate toy,
and the remaining 5 boys changed from their
original sex-appropriate choice to the
sex-inappropriate toy advocated by the teacher.
All boys in the control group picked a
sex-appropriate toy for themselves.
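
The reported statistic for the boys' resistance can be checked by hand. The sketch below assumes the test was a one-sample goodness-of-fit test comparing the 15 resisting and 5 changing boys against an even 10/10 split; the article does not state the expected frequencies, so that split and the function name are illustrative assumptions only.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Pearson chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Experimental boys: 15 resisted the teacher's choice, 5 changed to it.
observed = [15, 5]
n = sum(observed)

# Assumed null hypothesis: resisting and changing are equally likely.
expected = [n / 2, n / 2]

chi2 = chi_square_goodness_of_fit(observed, expected)
df = len(observed) - 1
print(f"chi-square = {chi2:.1f}, df = {df}")
```

Under this assumption the statistic agrees with the value given in the text, which suggests (but does not prove) that an even split was the null hypothesis used.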

Strong support was provided for the
hypothesis that boys would more often use
techniques to support resistance to change
than girls (t = 3.09, p < .005, one-tailed)
although the number of girls resisting change
(n = 14) did not differ from the number of
boys. The resistance techniques included
arguing with the teacher, derogating the
teacher in her absence, seeking social support
from experimenter, seeking social support
through peer persuasion, and seeking nonsocial
support in the form of the book choice.

Evidence that the experimental manipulations
elicited the resistance responses is
provided by comparisons of the experimental
and control groups. The control group sought
less of the available social support (χ² =
19.41, df = 1, p < .001), less of the available
nonsocial support (χ² = 9.12, df = 1, p <
.01), and tended to seek less support on peer
persuasion (χ² = 3.13, df = 1, p < .08).
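
The probability levels attached to these three statistics can be recovered without tables. The following is a minimal sketch that relies on the standard identity for one degree of freedom, where the upper-tail probability of a chi-square value x equals erfc(sqrt(x / 2)); the dictionary labels and the function name are hypothetical and chosen only for this illustration.

```python
import math


def chi_square_p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal deviate,
    # so the tail area reduces to the complementary error function.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Statistics as reported for the experimental vs. control comparisons.
reported = {
    "social support": 19.41,
    "nonsocial support": 9.12,
    "peer persuasion": 3.13,
}

p_values = {name: chi_square_p_value_df1(value) for name, value in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.5f}")
```

The computed probabilities fall below the thresholds quoted in the text (.001, .01, and .08 respectively), with the peer persuasion comparison remaining above the conventional .05 level.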

The hypothesis that both boys and girls
would pick sex-appropriate toys for boys far
more often than they did for girls was also
confirmed. All 60 subjects picked sex-appropriate
toys for boys, but they picked
sex-inappropriate toys for girls far more often
than appropriate ones (χ² = 11.62, df = 1,
p < .001).

The temporal reliability of the subjects’
toy choices was established by comparing 
their choices in each of the three sessions. 
Comparisons between the first and second 
sessions showed 77% agreement (n = 60) 
and between the second and beginning of the 
third session showed 100% agreement (n = 
60). It can be concluded that changes in toy 
choice at the end of the third session were the 
result of the experimental manipulations 
rather than of chance fluctuations in toy 
preference. 


Discussion 


Although the number of boys and girls
resisting change to a sex-inappropriate toy
choice did not differ, the reactions of the
boys differed greatly from those of the girls.
The boys used far more resistance responses
than the girls and showed considerable
anxiety. These findings are consistent with
Lynn’s (1959) theory and also with Festinger's
theory of cognitive dissonance
(1957). According to the latter theory, when
a child who has made a choice is exposed to a
high status powerful authority who advocates
a different choice the child will experience
psychological discomfort, that is, dissonance.
The child may reduce his dissonance
by seeking information to support his choice,
discrediting the authority, distorting the
content of the authority’s communication,
and persuading others to make the choice he
did. Boys are under greater pressure than
girls to exhibit sex-appropriate behavior, so
it follows from a dissonance theory explanation
that boys would exhibit more resistance
responses and would experience anxiety in
the situation.

The source of the pressure to change their 
toy selection was the same for both sexes, 
but the importance of the advocated change 
differed. A sex-inappropriate toy might easily
result in embarrassment and even punishment
for a boy, but this is not the case for
most girls (Lynn, 1959). Consequently the
kind of resistance response made by the
boys was, in almost all cases, directly concerned
with the sex-inappropriateness of the
toy advocated by the teacher. It was clear
to the experimenter that many of the boys
experienced anxiety about the teacher's
advocation of sex-inappropriate behavior.
Their attempts to discredit her were particularly
interesting in that they implied
that her faculties were suddenly and temporarily
impaired. For example, she was ill
(“Poor teacher, she must have a real bad
throat”), or suffering from a sudden loss of
memory (“Remember me? It’s your old
friend Charlie the Cowboy. Say, I bet she’s
just forgotten me and that’s why she says
take a necklace. She'll remember it’s me
tomorrow.”), or overworked (“Teacher has
too much to do today. You [experimenter]
shouldn’t ask her to do more things. Let’s
pretend we never asked her.”). By contrast,
the resistance responses of the girls reflected 
general dissatisfaction of a practical nature 
(“It wouldn't be fun to play with.” “My 
brother might take it.”) and they exhibited 
little or no anxiety at the teacher’s choice 
for them. 

The naturalistic setting of this study contained
a number of potential external pressures:
the child was choosing a toy to keep
and his choice was made among adults whom
he knew well and liked; furthermore, his
choice would subsequently be known to the
other adults and some of the children in the
nursery school and to his own family. His
toy choice was a function of his own sex-role
preference and the pressures for sex-appropriate
behavior that existed in his everyday
environment.

In contrast, the experimental procedures 
used in previous studies of sex-role prefer- 
ence (Brown, 1956; DeLucia, 1963; Hartup 
& Zook, 1960; Hetherington, 1965; Rabban, 
1950; Ward, 1969) lacked the pressures for 
sex-appropriate behavior that prevail in 
everyday social situations. The experiment- 
ers were strangers or short-term acquaint- 
ances of the children, the test choices at 
best were of academic interest to the children 
since in no case were they to keep any object 
chosen, and finally, the stated sex-role prefer- 
ence was known only to the child and the 
experimenter. Thus, an important variable 
in sex-role development, external pressure to 
conform, was absent. Furthermore, the test 
instruments commonly used in establishing 
sex-role preference are themselves subject to 
criticism, as McCandless (1967) and Ward 
(1969) have pointed out.


LEARNING STRATEGY ON PERFORMANCE IN
TWO UNDERGRADUATE PSYCHOLOGY CLASSES 


ROY D. GOLDMAN¹


Students in an undergraduate class in psychological statistics were
classified into two “learning strategy” groups: a logical group and
a mnemonic group. The logical group received significantly higher
grades on a number of academic criteria in two undergraduate classes.
Criteria included course grades, test grades, laboratory grades, and
term-paper grades. The strategy groups did not differ significantly
on six ability measures, nor did removal of ability covariates reduce
the performance difference between strategy groups. The specific
pattern of ability-strategy correlations differed for the
two strategy groups, indicating different learning processes. These
results suggested that it may be possible to seek “most efficient”
strategies for given tasks, and possibly, “conditionally” most efficient
strategies for individual ability profiles.


Investigations have indicated that it may
be possible to solve a given task by alternative
modes or strategies (Bruner, Goodnow,
& Austin, 1956; Frederiksen, 1969; French,
1965). Performance differences between individuals
on such a task might reflect differences
in strategy usage as well as ability
differences. The task-oriented investigation
of problem-solving strategies by Bruner,
Goodnow, and Austin demonstrated that
different strategies were differentially effective
for a concept learning task. The relatively
formal nature of the task, however,
made the identification of strategies possible,
but limited the generality of the approach.

A correlative investigation of problem-solving
strategies further defined the nature
of the ability-strategy relationship (French,
1965). French stated,


Some tests of higher mental processes are solved
in one way by some subjects and in another by
other subjects. This means that the tests may be
measuring different abilities for some subjects
than for others [p. 9].


¹ Requests for reprints should be sent to Roy D.
Goldman, Department of Psychology, University
of California, Riverside, California.


It was hypothesized that subjects using
different strategies of test taking would exhibit
different patterns of ability-performance
correlations. This hypothesis was well
supported; when subjects were classified into
two groups on the basis of their "analytic
attitude" toward a set of problems, a
unique pattern of factor loadings emerged
for each group. Frederiksen was able to
demonstrate a complex set of relationships
among abilities, strategy usage, and performance
on several verbal learning tasks.
He found that the correlations between a set
of ability measures (associative memory,
vocabulary, associational fluency, etc.) and
performance measures (number of words
learned per trial) were heavily moderated
by strategy choice. In addition, prediction
of performance under different task conditions
was improved by knowledge of a subject’s
strategy choice. In both the French
and Frederiksen studies, strategy choice was
assessed through self-report questionnaire
methods.

The present study is an attempt to investigate
the effects of strategy usage upon
performance on academic criteria, namely,
grades in two undergraduate classes. Two
deductions from Frederiksen's “differential
process theory” are that different strategies
have different efficiencies for a given task,
and that strategies moderate the transfer from
abilities to performance. If these deductions
are correct, then individuals using different
learning strategies should have different
levels of academic performance, and the
importance of specific abilities in determining
level of performance should be different
for individuals using different strategies.


METHOD


Subjects 


Subjects were 67 undergraduate students en- 
rolled in a psychological statistics class at the 
University of California, Riverside. (This class is 
required for the psychology major.) All subjects 
also were enrolled in a class in experimental psy- 
chology during this period. Course, laboratory, 
term-paper, and test grades from the two classes 
were used as multiple-performance criteria. 


Ability Measures 


All subjects were tested on six ability tests
from the 1963 edition of the Kit of Reference Tests
for Cognitive Factors (French, Ekstrom, & Price,
1963). The specific tests (Mathematics Aptitude
[R-2], Advanced Vocabulary [V-4], Inference
[Rs-2], Letter Sets [I-1], Division [N-2], and
Object-Number [Ma-2]) represented the factors
of general reasoning, vocabulary, syllogistic
reasoning, induction, number facility, and associative
memory. These tests were chosen because
they seemed to represent the processes required
by the set of criteria. Although these tests have
parallel halves, only one half was administered
from each test because of time constraints.

The criteria were facets of the course
requirements for the two psychology classes. In
the statistics class the criteria were (1) the sum of
weekly quiz grades, (2) a score on a conceptual
examination, (3) a score on a computational
final examination, and (4) a grade for the course.
In the experimental psychology class, the criteria
were (5) the sum of the midterm and final examinations,
(6) a grade for the laboratory section, (7)
a grade for an independently composed research
proposal, and (8) a grade for the course. Criteria
4, 6, 7, and 8 were letter grades on the familiar
0-4 grade point scale. The other criteria were
based upon continuous scales. Brief descriptions
are given below.

Weekly quiz total for statistics class. This was
the sum of the seven highest (out of [illegible])
examination scores. The examinations were each
approximately 25 minutes long and covered the
previous week's material. Some examinations
were computational and others were conceptual.

Conceptual examination score for statistics class.
This was a 1-hour examination, covering the content
of the whole course. Most questions could be
answered formally (algebraically), inductively
through examples, or in a looser verbal way.

Computational examination score for statistics
class. This was a 2-hour examination which required
the student to solve numerical problems.
Numerical answers were expected with verbal
explanations as supplementary information.

Course grade for statistics class. This grade was 
based upon the weighted total of the weekly quiz 
grades, the conceptual final examination and the 
computational final examination. 

Midterm/final total for experimental psychology 
class. The midterm examination consisted of a 
number of identification items and short answer 
questions. The final examination consisted of a 
description of two experiments to be criticized by 
the student, on the basis of adequacy of design. 

Laboratory grade for experimental psychology 
class. This grade was assigned by the teaching 
assistant for each laboratory section on the basis 
of seven written laboratory reports, and his eval- 
uation of the overall adequacy of performance. 

Research proposal grade. This grade was assigned
by the teaching assistant on the basis of
the adequacy of a student-designed research
proposal. The choice of topic was left to the
student.

Course grade for experimental psychology class. 
This grade was based upon performance on the 
examinations, the research paper, and the labo- 
ratory section. 

Strategy assessment questionnaire. A series of
interviews with students indicated to the experimenter
that three basic approaches were used in
the learning of statistics: a formal algebraic approach,
a looser verbal-logical approach, and a
rote memory approach. The questionnaire presented
below was administered to all subjects. Responses
on this questionnaire were used to classify
subjects into strategy groups. To minimize response
bias, subjects were assured that the questionnaires
would not be scored until after the
finish of the semester.

Strategy questionnaire. This questionnaire is
designed to investigate some aspects of academic
activities that may be related to class performance
but are rarely measured.

Strategies: It appears likely that people use 
different techniques to learn statistics. Pre- 
sented below are three possible classifications. 
Although there may be other ways of clas- 
sification, check the one that best describes 
your strategy. 


1. Mathematical-Formal: Try to learn algebraic
derivation of each statistical technique.

2. Logical-Formal: Try to learn the underlying
reasons for the technique in a verbal way.

3. Mnemonic Concrete: Try to learn the
computational techniques by observing
examples, often without worrying about
reasons for the technique.


Design

Subjects were classified into strategy groups by
sex. The mathematical-formal strategy was so
rarely chosen (three males, one female) that it was
combined with the logical strategy group. Thus a
2 × 2 (Sex × Strategy) design was used. The number
of subjects in each cell is shown in Table 1.


TABLE 1
NUMBER OF SUBJECTS IN EACH GROUP

Subjects    Logical    Mnemonic
Male           26         16
Female          9         16


Since the two factors of classification are not independent
(χ² = 4.2, df = 1, p < .05), the design
is nonorthogonal.
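
The dependence between sex and strategy choice can be reproduced from the cell counts in Table 1. The sketch below assumes an ordinary Pearson test of independence without a continuity correction (the article does not say which form was used); the function and variable names are illustrative only.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if sex and strategy were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: male, female. Columns: logical, mnemonic (Table 1).
table_1 = [[26, 16],
           [9, 16]]

chi2 = chi_square_independence(table_1)
n_subjects = sum(sum(row) for row in table_1)
print(f"N = {n_subjects}, chi-square = {chi2:.1f}")
```

The uncorrected statistic rounds to the reported 4.2, and the column totals (35 logical, 32 mnemonic) match the group sizes used in the later analyses. Because the cell sizes are disproportionate, the sex and strategy effects overlap, which is why the order of testing matters in the analyses of variance.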


RESULTS 


Reliabilities and Intercorrelations of Measures 


The order of administration, means, standard
deviations, and estimated reliabilities
of all ability measures are presented in
Table 2. The intercorrelations of the ability
measures and the intercorrelations of the
criteria are presented in Tables 3 and 4. It
can be seen from Tables 3 and 4 that the
criteria are highly intercorrelated while the
ability measures are not.


TABLE 2
RELIABILITIES, MEANS, AND STANDARD
DEVIATIONS OF ABILITY MEASURES

Measure                  Reliability      SD
Mathematics                  .78         2.36
Vocabulary (advanced)        .59         2.09
Letter sets                  .66         2.69
Inference                    .52         1.61
Object number                .70         3.66
Division                     .96         8.59

[Order of administration and means not legible in the source.]

Note. N = 67. Reliabilities computed by odd-even
split-half method, corrected for double
length by the Spearman-Brown formula.
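
The double-length correction mentioned in the note to Table 2 follows the Spearman-Brown formula, in which the reliability of the full test is 2r / (1 + r) for a correlation r between its two halves. The sketch below is illustrative only: the function names are hypothetical, and the value .78 is taken from the Mathematics row of Table 2 simply as an example of a corrected coefficient.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Reliability of a test lengthened by length_factor, given the
    correlation r_half between comparable parts of the original length."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def implied_half_correlation(r_corrected):
    """Invert the double-length correction to recover the raw
    odd-even correlation that would yield r_corrected."""
    return r_corrected / (2.0 - r_corrected)


# Example: a corrected reliability of .78 implies a lower raw
# correlation between the odd and even halves.
r_half = implied_half_correlation(0.78)
r_full = spearman_brown(r_half)
print(f"half-test r = {r_half:.2f}, corrected reliability = {r_full:.2f}")
```

The inversion makes plain that the coefficients in Table 2 are estimates for the full-length halves actually administered, and that the uncorrected odd-even correlations would have been noticeably smaller.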


Comparisons of Strategy Group Centroids


To handle the nonorthogonal nature of
the design, each analysis of variance contrast
of interest was made last. That is, all F-ratio
comparisons were made in stepwise fashion, in
which the contrast made after all the others
was unbiased. By reordering the series of
contrasts the strategy main effect, sex main
effect, and Sex × Strategy interaction could
all be tested last, and thus be unbiased by
the nonorthogonality of the design.

Academic criteria. The two strategy groups
were compared on the basis of their per- 
formance on the eight academic criteria. The 
multivariate analysis of variance, using 
Rao's approximation to the F ratio, indi- 
cated a significant difference between the 
two groups (F = 2.21, df = 8/56, p = .04). 
(See Table 5). The univariate F ratios for 
each of the variables also show this pattern 
(Table 5). A multivariate comparison be- 
tween males and females indicated no signifi- 
cant differences between performance cen- 
troids (F < 1). Similarly the Sex X Strategy 
interaction was also nonsignificant (F = 1.4, 
df = 8/56, p < .25). 

Ability measures. The two strategy groups
were compared on the six ability measures
(Table 6) with a result (F = 1.26, df =
6/58, p < .30) indicating no significant
difference in ability centroids between the
two strategy groups. There were no significant
sex differences (F = 1.12, p < .35) or
sex by strategy (F < 1) differences in
ability centroids. Although the strategy
groups did not differ significantly in abilities,
it is not true that there was no difference in
abilities between these two groups. To test
for the possibility that performance differences
between strategy groups might be due
to some ability differences between these
groups, a multivariate analysis of covariance
was performed; academic performance measures
were considered as dependent variables
and ability measures as covariates (Table 7).
It appears from Table 7 that the difference
in performance measures between the two
strategy groups is virtually identical to that
shown in Table 5 in which no reduction for
covariates was employed.


TABLE 3
[Intercorrelations of ability measures; cell values not legible in the source.]

1. Mathematics
2. Vocabulary
3. Induction
4. Inference
5. Memory
6. Computational Speed


TABLE 4
INTERCORRELATIONS OF CRITERIA

Grade                           1     2     3     4     5     6     7         8
1. Quiz (Statistics)            --
2. Conception (Statistics)     .65    --
3. Computation (Statistics)    .62   .61    --
4. Course (Statistics)         .40   .74   .75    --
5. Exam (Experimental)         .43   .62   .49   .53    --
6. Lab (Experimental)          .28   .97   .50   .44   .53    --
7. Paper (Experimental)        .28   .30   .43   .40   .52   .63    --
8. Course (Experimental)       .49   .57   .49   .50   .75   .84  [illeg.]    --

Note. N = 67. For p < .05, r = .23; for p < .01, r = .30.


TABLE 5
COMPARISON OF STRATEGY GROUP CENTROIDS FOR PERFORMANCE MEASURES

                                                                   Discriminant function
                                   Means          Univariate           coefficients
Measure                       Logical  Mnemonic     F       p      Raw score  Standardized
Quiz (Statistics)              72.14    64.03      5.59   .021        .02       [illeg.]
Conception (Statistics)        23.57    18.06     12.80   .0007     [illeg.]      .92
Computation (Statistics)       30.83    25.76      4.70   .034        .01        -.12
Course (Statistics)             3.11     2.43      7.81   .008       -.35        -.41
Exam (Experimental)            68.43    64.19      5.04   .028       -.02        -.19
Lab (Experimental)              3.05     2.43      9.90   .002        .82         .81
Paper (Experimental)            3.05     2.78      2.59   .12        -.08       [illeg.]
Course (Experimental)           3.14     2.68      7.94   .006       -.14        -.12

Note. Multivariate F = 2.21, df = 8/56, p < .04.


TABLE 6
COMPARISON OF STRATEGY GROUP CENTROIDS FOR ABILITY MEASURES

Ability measures: Mathematics, Vocabulary, Inference,
Induction, Number, Memory
[Cell values not legible in the source.]

Note. Multivariate F = 1.26, df = 6/58, p < .30.



LOGICAL VERSUS A MNEMONIC LEARNING STRATEGY 


TABLE 7
COMPARISON OF STRATEGY GROUP CENTROIDS FOR
PERFORMANCE MEASURES WITH ABILITY
MEASURES REMOVED AS COVARIATES

                                 Univariate             Discriminant
Performance measures             F         p        Raw score  Standardized
Quiz (Statistics)             [illeg.]  [illeg.]       .02         .30
Conception (Statistics)       [illeg.]  [illeg.]       .10       [illeg.]
Computation (Statistics)      [illeg.]  [illeg.]       .00        -.02
Course (Statistics)           [illeg.]  [illeg.]      -.23        -.23
Exam (Experimental)           [illeg.]  [illeg.]       .01         .13
Lab (Experimental)              9.36      .003        1.15        1.13
Paper (Experimental)            1.24      .269        -.20        -.21
Course (Experimental)           5.63      .021        -.74        -.63

Note. Multivariate F = 2.18, df = 8/50, p < .04.


Strategy Group Differences in the Relationship 
of Abilities to Performance Measures 

The within-strategy group correlations of
abilities with performance measures are presented
in Table 8. For the logical group, there
seems to be little systematic relationship
between ability and performance measures
(with the possible exception of significant
correlations between syllogistic reasoning
and two of the criteria). For the mnemonic
group, however, there seems to be a somewhat
different pattern of ability-performance
relationships. The associative memory
ability is a significant predictor of performance
on most of the criteria. In addition,
induction and general (mathematical)
reasoning are significant predictors of the
statistics class criteria.


Discussion 


It appears from Table 5 that the choice of 
a logical or mnemonic learning strategy did 
indeed affect performance on the academic 
criteria which were studied. Since strategy 
usage was surveyed rather than manipulated, 
it is rather difficult to demonstrate a direc- 
tion of causality. It should be noted, how- 
ever, that the elimination of ability differ- 
ences between the strategy groups (through 
analysis of covariance) did not diminish the 
difference in academic performance between 
these two groups. Thus, it appears that 
strategy choice exerts effects upon perform- 
ance, independently of the effects of abilities. 


TABLE 8
CORRELATION OF ABILITY MEASURES WITH PERFORMANCE MEASURES

Columns: Statistics (Quiz, Conception, Computation, Course);
Experimental (Exam, Lab, Paper, Course).
Rows, given separately for the logical group and the mnemonic group:
Mathematics, Vocabulary, Inference, Induction, Number, Memory.
[Cell values not legible in the source.]

Note. N = 35 in the logical group and 32 in the mnemonic group.
* p < .05.


If strategies mediate the transfer from
abilities to performance as hypothesized by 
Frederiksen, then there should be different 
patterns of correlations between abilities and 
performance measures for the different strat- 
egy groups. Clearly, this was the case, as 
evidenced by Table 8. The consistent corre- 
lation of associative memory with perform- 
ance measures for the mnemonic group 
indicates that the adoption of a mnemonic
strategy places considerable valence upon
memory abilities. In the logical group, there 
was no correlation of memory with per- 
formance. It should also be noted that the 
mnemonic group showed correlations be- 
tween induction and performance, while the 
logical group occasionally showed correla- 
tions between syllogistic reasoning and per- 
formance. 

There was a difference between the pattern 
of ability-performance correlations for the 
two courses (experimental psychology vs. 
statistics) as shown in Table 8. In general, 
there were correlations between ability meas- 
ures and performance for the statistics 
course but not for the experimental psy- 
chology course. There was also an apparent 
interaction between strategy and course as 
moderators of the ability-performance corre- 
lations. In light of the small sample size, 
relative to the number of correlations, one 
should view this relationship warily. 

This study provides both pragmatic and
theoretical information. Pragmatically, it
indicates that there is an effectiveness differ- 
ence between strategies for learning statistics 
and experimental psychology. In principle, 
this approach might be employed to seek 
optimal strategies for a wide range of tasks. 
The discovery of most efficient strategies for 
tasks could provide a direction for educa- 
tional approaches to those tasks. In addi- 
tion, the pattern of within-strategy group 
correlations would indicate that ability
characteristics of individuals might render
strategies differently effective. For example,
subjects who were very good at associative
memory would be expected to perform better
using the mnemonic strategy than subjects
who were poor at associative memory. It
might be possible, therefore, to discover
strategies that are not only most efficient
for a given task but that are conditionally
most efficient for that task given a certain
ability profile.

Theoretical information is provided by
this study in that it provides additional
support for the construct validity (Cron-
bach & Meehl, 1955) of “strategies.” The
construct of strategy enters into a nomo-
logical network in two ways: strategies seem
to produce performance differences between
subjects, and strategies seem to moderate
the transfer (as evidenced by correlations)
from abilities to performance. The results of
this study, like those reported by : 
(1965) and Frederiksen (1969), support the
utility of including strategies in the study of
learning and information retrieval.
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KINDERGARTEN CHILDREN’S USE OF SPATIAL-POSITIONAL, 
VERBAL, AND NONVERBAL CUES FOR MEMORY' 


DAVID A. CORSINI 


University of Georgia 


The ability of kindergarten children to use different types of stimu- 
lus information in a memory task was examined. Sixty-six kinder- 
garten children performed a memory task under one of six conditions. 
The conditions differed as to the type and number of stimulus cues 
(verbal, concrete, and spatial positional) presented to the child for 
the purpose of forming a memory code. The results showed that 
kindergarten children remember best under conditions in which both 
verbal and nonverbal stimulus cues are available. There was only 
minimal evidence that these children used spatial-positional cues 
for memory purposes. The results were discussed with reference to
developmental changes in the ability to form internal representations
of stimulus events.


A major task for the educator is to 
structure learning situations in ways that 
provide for the maximum retention of infor-
mation. This is a complex task requiring the
simultaneous consideration of several factors. 
Three factors involved are: the conceptual 
complexity of the information, the cognitive 
capabilities of the learner, and the manner 
in which the information is presented to 
the learner. The present study is concerned 
with the ability of kindergarten children, 
with their particular cognitive capabilities, 
to use verbal and nonverbal cues in a 
memory task. The specific questions asked
by the present study can be put in per-
spective by briefly discussing the
theoretical framework and results of two
previous studies (Corsini, 1969a, 1969b).

Both Piaget (1947, and cf. Flavell, 1963) 
and Bruner (1964, 1966) have suggested 
that the representation of information by 
means of abstract symbols, such as words, is
a relatively advanced form of representation.
Piaget refers to the developmentally earlier 
type of representation as imagistic represen- 
tation, while Bruner refers to it as “ikonic” 
representation. One of the important distinc-
tions between this type of representation

1 This research was supported in part by the
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Development Center in Educational Stimulation, 
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and the more advanced type is its nonver- 
bal quality. This presumed developmental 
change in representational ability suggests 
the possibility of developmental changes in 
the ability to use different types of stimulus 
cues. For example, the ability to use non- 
verbal stimulus properties, such as color and 
form, as a basis for representation may 
precede the ability to use verbal stimuli 
(e.g., words). 

If this were the case, the frequently re-
ported memory limitations of young children
may stem primarily from an inability 
to represent abstract-symbolic information. 
The memory limitation of young children
might not be as great as expected when the 
to-be-remembered information is presented 
in a form that is more amenable to the 
young child's dominant representational
mode: imagery.

The results of the previous studies
largely supported these suggestions. In the
first of these studies (Corsini, 1969a), the
retention of kindergarten children was ex- 
amined under four conditions. One required 
the child to listen to a verbal instruction and 
then to perform the manipulations requested. 
In the other three conditions, the child was 
given verbal instructions but the whole 
spatial display of objects was perceptually 
available. These latter three conditions 
differed from one another in the degree to 
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which the child's attention was directed to 
the stimuli relevant for performance. In 
general, when compared to the Verbal Only 
condition, retention was superior in all three 
of the conditions where the objects were 
perceptually available during the instruc- 
tions. 

While it was advanced that the availa- 
bility of the nonverbal cues during the 
instructions allowed the kindergarten sub- 
jects to form better memory codes of the 
instructions, at least three possibilities could 
account for the observed differences. One 
of the possibilities was that by seeing the 
display of objects, the subjects might have 
noted the positions of the relevant objects 
and performed well by maintaining atten- 
tion to the objects and/or positions until 
performance was begun. If a subject was 
performing the task by maintaining atten- 
tion to the relevant objects, he need not have 
formed an internal representation of the 
instruction which could be preserved through 
spatial and temporal displacements. Since 
the main concern of this research program is 
with the ability to form internal representa- 
tions, the possibility of correct performance 
by this type of attentional process was 
eliminated in the second and present studies. 

Better performance might also have oc- 
curred because the presence of nonverbal 
cues allowed the subjects to form better 
internal representations. From the nature 
of the conditions in the first study, two 
conceptually different types of codes could
have been formed: (a) a code that was 
independent of the particular spatial ar- 
rangements of the objects; and (b) a code 
that was dependent upon, or at least in- 
cluded code elements for, the spatial posi- 
tions of the objects. 

The second study (Corsini, 1969b) ex- 
amined developmental changes (preschool 
and second-grade subjects) in the ability to 
form a memory code, independent of spatial 
arrangements, as a function of verbal and 
nonverbal stimulus cues. When superior per- 
formance by maintenance of attention and 
coding of spatial position was eliminated, 

the preschool subjects still performed signifi-
cantly better when both verbal and non-
verbal cues were present than when only
verbal cues were present. Performance of the 
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second-grade subjects was essentially un- 
affected by the different conditions. 

The present study was designed to explore 
the ability of kindergarten children to use 
memory codes that were dependent upon, or 
at least included code elements for, the 
spatial positions of stimulus objects. As in 
the second study, the possibility of successful 
performance by maintenance of attention 
was eliminated. 


Method
Conditions 

Two of the six conditions, Verbal Only and 
Verbal Concrete Object, were identical to those 
used in a previous study (Corsini, 1969b). 

Condition V (verbal). Instructions were given 
using only words. The subject was asked to listen 
carefully while the experimenter told him what to 
do. After hearing the instruction, the subject 
turned to the second table and attempted to 
perform the task. This condition assessed the 
subject’s ability to deal with verbally presented 
information. 

Condition VCO (verbal concrete object). Instruc- 

tions were given verbally as in Condition V, but, 
when each object was mentioned, concrete ex- 
amples were held up for the subject to see. For 
example, if the instruction was, “Put the blue car 
into the red box,” the experimenter held up a blue 
car when he said “blue car” and a red box when he 
said “red box." Thus, a subject received both 
verbal and concrete-perceptual cues. After hearing 
the instructions and seeing the objects, the subject 
turned to the second table and attempted to 
perform the task. This condition assessed the 
degree to which adding a simple nonverbal cue 
aided the retention of a verbally presented instruc- 
tion. 
The remaining four conditions were concerned 
with investigating the subject’s ability to make 
use of cues available in a spatial display. Instruc- 
tions were given nonverbally with the whole spa-
tial array of objects visually before the subject.
Since all subjects were given instructions while 
standing before one table and performed the tasks 
with objects situated on another table located 90 
degrees to the subject’s right, an internal memory 
code was necessary for successful performance. 

If a subject was to make use of the spatial array 
for his internal code, he could do it in two ways: 
First, while watching the instruction being given,
the subject could code for himself something
nonspecific such as, “The item on the bottom-right
goes into the container at the top-left.” It is not
suggested that a subject would necessarily ver-
balize this code to himself but rather that his code
would be functionally equivalent to this state-
ment. The code itself might be an imagistic rep-
resentation of the action performed by the experi-
menter in making a particular placement. This
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type of code would lead to successful performance 
under conditions in which the spatial arrangement 
of specific objects was exactly as it had been when
the code was made. However, if the spatial ar- 
rangement of the specific objects was changed 
between the time the code was made and the time 
performance was required, performance using 
this nonspecific code would not be successful. 

A second possibility is that the subject could 
form a code similar to that just described, but 
which additionally coded the specific objects that 
were to be manipulated, for example, red car, blue 
cup, etc. If the internal code contained repre-
sentations of specific objects, successful perform-
ance would not depend upon the same spatial
arrangement of objects during the instruction and 
performance periods. 

Condition NV I (nonverbal identical). In this 
condition the instructions were given without 
words. All objects involved in the instructions 
were placed systematically on a circular piece of 
wood. The subject was told to watch carefully as 
the experimenter manipulated the objects (the 
instruction) so that he, the subject, could perform 
the identical manipulations. After the experi- 
menter had performed the manipulations for an 
instruction, the subject turned to the second table 
and attempted to perform the manipulations with 
the set of objects that was there. In this condition 
the spatial arrangement of objects during a sub- 
ject’s performance was identical to the spatial 
arrangement of objects when the instructions had 
been given. Since the performance was carried out 
with a different set of objects in a different lo- 
cation, the subject could not perform successfully 
by a simple attentional process, but rather had to 
form a representation which could be transported 
through space and time. However, since the spatial 
arrangement of objects was not changed, success- 
ful performance could be accomplished with a 
nonspecific code; that is, a code which noted only
spatial positions and movements but not the spe-
cific characteristics (color or type) of the objects.

Condition NV D (nonverbal different). This
condition was essentially the same as Condition
NV I with the exception that during the per-
formance period the stimulus objects were in a
different spatial arrangement from that during
the instructions. Thus, if a subject had formed a
nonspecific representation of the instruction to be
carried out, the representation would not help him
during his performance. By comparing the per-
formance in this condition with that in Condition
NV I, it would be possible to determine whether
subjects formed specific or nonspecific representa-
tions when a spatial array of objects was visually
present and the instructions were given nonver-
bally.

Condition VNV I (verbal nonverbal identical). 
This condition was identical to Condition NV I 
with the exception that as the instructions were
given nonverbally, the experimenter also gave the
instructions verbally.

Condition VNV D (verbal nonverbal different).
This condition was identical to Condition NV D
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with the exception that as the instructions were 
given nonverbally the experimenter also gave the
instructions verbally. These last two conditions
assessed whether adding verbal information to
nonverbal information would result in a different
representational strategy by the subject; that is,
when verbal information was provided, did sub-
jects tend to form more specific codes?

Subjects

The sample consisted of 66 kindergarten chil- 
dren with a mean chronological age of 56 months. 
Eleven subjects were assigned to each of the six 
conditions with an approximately even distri- 
bution by sex. The subjects were obtained from 
the University of Georgia Laboratory nursery 
school and from a church-sponsored nursery 
school near the University. The sample of children 
was primarily Caucasian. They were from families 
whose educational and occupational background 
was predominately middle class. Most of the 
subjects had at least 1 year of previous schooling. 


Stimuli 


The stimuli consisted of toy cars, buttons, 
cups, and boxes. Each of these objects was pre- 
sented in three colors—red, yellow, and blue. For 
performance, the cups and boxes were arranged in 
a semicircle on the table in front of a subject with 
the buttons and cars placed within the semicircle. 
The arrangement of the stimuli was systematically 
determined dependent upon the conditions of 
instructions as explained above. An additional set 
of identical objects was used to give instructions 
under some conditions. 


Instructions 


Each subject received four instructions which 
differed in difficulty. The order of instructions was 
always easiest to most difficult. Examples of the 
instructions are: 

Level 1—Put the red car into the yellow cup. 

Level 2—Put the yellow car and the blue 
button into the red box. 

Level 3—Put the yellow button into the red 
box and the blue car into the yellow cup. 

Level 4—Put the blue car and the red button 
into the yellow box and the yellow car into 
the blue cup. 


Procedure 

A young female experimenter performed the 
manipulations with all the subjects. They were 
brought into the experimental room and asked to 
name the objects and colors involved in the in-
structions. Five subjects were eliminated and
replaced on the basis of being unsure of their
colors.

Each subject was shown the experimental
materials, told that the experimenter was going to 
ask him to do different things, shown where to 
perform the tasks, and given a practice trial (a 
Level 1 instruction). The practice trial was given 
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in the same manner as the other instructions would 
be given. The experimenter corrected the subject 
if he had not performed correctly. Each subject 
was then given four instructions. The subject was 
complimented on his performance on each trial. 


Scoring 

Each of the objects and each of the receptacles
that were used by the subject in his performance
was given a score of 1 if it had been mentioned
in the instruction. If the manipulated object-
receptacle relationship was completely correct, an additional
point was given. Although this additional point 
may seem artificial, this scoring was necessary to 
distinguish between partially correct and com- 
pletely correct performance. For example, in a 
Level 3 instruction a child could have correctly 
retained the four stimulus objects but have gotten 
the placements reversed. If just the correctness of 
objects and not their placement was scored, this 
performance would not be distinguished from 
that which retained the correct object-receptacle 
relationships. Thus, the highest score for each of 
the levels was 3, 4, 6, and 7, respectively. 
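The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of an instruction as a list of (objects, receptacle) placements, the function name `score_response`, and the reading of "completely correct" as an exact match of objects and receptacle are all hypothetical choices, not details given in the report.

```python
from typing import List, Set, Tuple

# Hypothetical representation: one placement = (set of objects, receptacle).
Placement = Tuple[Set[str], str]


def score_response(instruction: List[Placement], response: List[Placement]) -> int:
    """Score one trial: 1 point per instructed object or receptacle that was
    used, plus 1 point per completely correct object-receptacle placement."""
    instructed_objects = set().union(*(objs for objs, _ in instruction))
    instructed_receptacles = {rec for _, rec in instruction}
    used_objects = set().union(*(objs for objs, _ in response))
    used_receptacles = {rec for _, rec in response}

    # One point for each used item that the instruction mentioned.
    score = len(used_objects & instructed_objects)
    score += len(used_receptacles & instructed_receptacles)

    # Additional point for each placement reproduced exactly (assumed reading).
    score += sum(1 for placement in instruction if placement in response)
    return score


# Level 3 example from the Instructions section.
level_3 = [({"yellow button"}, "red box"), ({"blue car"}, "yellow cup")]
# All four stimulus objects retained, but the placements reversed.
reversed_placements = [({"blue car"}, "red box"), ({"yellow button"}, "yellow cup")]

print(score_response(level_3, level_3))              # 6, the Level 3 maximum
print(score_response(level_3, reversed_placements))  # 4, partially correct
```

Under this reading, the maximum scores for the four example instructions come out as 3, 4, 6, and 7, which agrees with the values stated in the text.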


Results


A 6 (Condition) × 4 (Level) analysis of
variance was performed on the scores. This
analysis revealed a significant condition
effect (F = 5.50, df = 5/60, p < .001), a
significant level effect (F = 3.32, df = 3/180,
p < .025), and a nonsignificant Condition ×
Level interaction. The significant level effect
can be attributed to the fact that a subject’s 
possible score increased from Level 1 to 
Level 4, and the subject’s performance re- 
flected this possibility. 
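The reported degrees of freedom can be checked against the design. The sketch below assumes, as the reported values suggest but the text does not state outright, that Condition was a between-subjects factor with 11 subjects per group and Level a within-subjects factor; the function name `mixed_anova_df` is hypothetical.

```python
def mixed_anova_df(n_conditions: int, n_per_condition: int, n_levels: int) -> dict:
    """Degrees of freedom (effect, error) for a between x within design."""
    n_subjects = n_conditions * n_per_condition
    between_error = n_subjects - n_conditions            # subjects within groups
    within_error = (n_levels - 1) * between_error        # level x subjects within groups
    return {
        "condition": (n_conditions - 1, between_error),
        "level": (n_levels - 1, within_error),
        "condition_x_level": ((n_conditions - 1) * (n_levels - 1), within_error),
    }


# Six conditions, eleven subjects each, four instruction levels.
print(mixed_anova_df(6, 11, 4))
```

With these inputs the condition effect has 5 and 60 degrees of freedom and the level effect 3 and 180, matching the values reported above.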

The mean performance scores for the
different conditions were as follows: Verbal,
8.08; Verbal Concrete Object, 11.20; Non-
verbal Identical, 6.72; Nonverbal Differ-
ent, 6.80; Verbal Nonverbal Identical,
11.44; Verbal Nonverbal Different, 9.80.
A Newman-Keuls analysis of the differences
between these means revealed the following:
the performance in Verbal Nonverbal Differ- 
ent condition did not differ significantly from 
any other condition; performance in the 
Verbal Nonverbal Identical condition did 
not differ from performance in the Verbal 
Concrete Object condition, but performance 
in both these conditions was significantly 
better than performance in the Verbal, Non- 
verbal Identical, and Nonverbal Different 

conditions; performance in the latter three 
conditions did not differ significantly from 
each other. 
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Discussion 


Of particular interest in the present study 
were the four conditions that made spatial- 
positional information available during the 
instructions. Because the instructions were
performed in a new spatial location, a sub-
ject could not use the spatial-positional
information without forming an internal code 
resistant to spatial and temporal displace-
ment. Whether these kinder-
garten children used spatial positions and
movements as part of their memory code
was assessed by comparing the performance
in the Nonverbal Identical condition with
performance in the Nonverbal Different
condition and performance in the Verbal
Nonverbal Identical condition with performance in
the Verbal Nonverbal Different condition.
Memory codes which coded specific spatial
positions and movements would facilitate
performance in the Identical conditions but
would not facilitate, or would interfere
with, performance in the Different conditions.
If subjects were coding specific spatial posi-
tions and movements, superior performance
in the Identical as opposed to the Different
conditions would be expected.
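The Identical versus Different comparison can be illustrated with the condition means reported in the Results section. This is a minimal sketch: the dictionary keys are abbreviations chosen here for convenience, and the differences are descriptive only, since the significance tests reported in the text are not reproduced.

```python
# Condition means as reported in the Results section.
condition_means = {
    "V": 8.08,       # Verbal
    "VCO": 11.20,    # Verbal Concrete Object
    "NV I": 6.72,    # Nonverbal Identical
    "NV D": 6.80,    # Nonverbal Different
    "VNV I": 11.44,  # Verbal Nonverbal Identical
    "VNV D": 9.80,   # Verbal Nonverbal Different
}

# A positive difference is the direction expected if spatial positions were coded.
identical_minus_different = {
    "nonverbal": round(condition_means["NV I"] - condition_means["NV D"], 2),
    "verbal_nonverbal": round(condition_means["VNV I"] - condition_means["VNV D"], 2),
}

print(identical_minus_different)
```

The nonverbal pair differs by -0.08 and the verbal nonverbal pair by 1.64, so only the latter is in the direction predicted by spatial coding.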

The only indication that representation
based on the spatial elements was used in the
Nonverbal conditions was the nonsignificant
decrease in the performance in the Verbal
Nonverbal Different condition as compared
to the performance in the Verbal Nonverbal
Identical condition. This decrease, which
made performance in the Verbal Nonverbal
Different condition not significantly different
from any other condition, raises the possi-
bility that some preschool subjects can and
do use spatial positions and movements as
part of a memory code. Stronger evidence
for the ability of somewhat older children to
use this type of coding strategy has been
obtained by the present author (Corsini,
1970b).

The relatively poor performance of these
preschool subjects in the Verbal condition
and relatively good performance in the
Verbal Concrete Object condition support
similar findings presented in three other
studies (Corsini, 1969a, 1969b, 1970a).

In some respects the performance of the
preschool subjects seems quite strange. When
only verbal information is presented, per-
formance is poor. This can be understood on
the basis of their limited representational
abilities with respect to symbolic material as
discussed by Piaget and Bruner. However,
it would be additionally expected that the 
performance of preschool children would 
improve when the stimulus information was 
more amenable to their dominant represen- 
tational abilities; that is, when the stimulus 
information is concrete. Thus, if the pre-
school child is able to use imagery, he should
perform well in the nonverbal conditions.
However, the preschool child performs well
only when both verbal and nonverbal stimu-
lus components are present. When presented
with only nonverbal instruction, the pre- 
school child fails to code it even though he is 
or should be able to do so. The verbal com- 
ponent of the instruction appears to function 
as a cue to the child to code something. How 
well he codes the instruction is dependent 
upon the type of stimulus information avail- 
able to him; that is, he will be fairly success- 
ful if there is nonverbal stimulus material 
available to him but not successful if only 
the verbal instruction is given. If the 
verbal component is not present, the pre- 
school child appears to fail to actively code 
the instruction even though he should be 
capable of coding from the nonverbal ma- 
terials. Similar failures of preschool children 
to function to the best of their capacity have 
been observed by other investigators (Fla- 
vell, 1970). This apparent failure of young 
children to perform as well as they might 
remains an area in need of conceptual
clarification. 

The pattern of performance across condi-
tions further indicates that one of the ways
in which children learn to comprehend
verbal information is by its pairing with a
concrete referent. The pairing of verbal cues
with specific concrete referents is one way in
which verbal cues come to have meaning by
themselves. Not only is the pairing of verbal
and nonverbal cues important for the child
to learn what verbal cues mean but it is also
probably important for the development of
the spontaneous use of internal verbalization
in problem-solving situations. The spon-
taneous use of internal verbalization has
often been found to be associated with
higher level problem-solving behavior.

The present work gives a theoretical refer- 
ence to help teachers of the young to under- 
stand why certain methods of communicat- 
ing information work more effectively than 
others: developmental changes in represen-
tational capabilities. However, for any given
situation the probability of success is an 
interaction among the conceptual complexity 
of the information, the conceptual capabili- 
ties of the learner, and the manner in which 
the information is presented. For example, 
it is also probable that the same type of 
effect as found in the present study (i.e., 
better retention of concretely supported 
information) would be found with adults in 
situations where the presented information 
was conceptually difficult. Thus, when edu- 
cators present new information to learners, 
they should carefully consider the above 
factors to increase the probability of their 
success.
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RELATIONSHIP OF INTELLIGENCE, VISUAL-MOTOR 
SKILLS, AND PSYCHOLINGUISTIC ABILITIES WITH
ACHIEVEMENT IN THE THIRD, FOURTH,
AND FIFTH GRADES:
A FOLLOW-UP STUDY


OWEN B. DUFFY, IV, THEODORE N. CLAIR,
BYRON EGELAND, AND MARIO DINELLO


University of Iowa, Syracuse University, Texas Women’s University


The relationships of intelligence, psycholinguistic abilities, and
visual-motor skills with achievement were investigated for third-,
fourth-, and fifth-grade pupils, using the Vocabulary, Reading, and
Arithmetic subtests of the Iowa Test of Basic Skills as measures of
achievement. The statistical procedure utilized for this study was
the “stepregn” method of multiple-regression analysis. An evalua-
tion of the data revealed that psycholinguistic ability was the only
independent variable contributing significantly to the multiple cor-
relations computed for third-grade achievement. The fourth- and
fifth-grade findings were entirely different. Here, except for the
achievement criterion, vocabulary, in the fourth grade, visual-motor
skills was found to add significantly to the multiple correlations
computed for each of the grades.


This investigation is concerned with the 
follow-up of Egeland’s (1966) study which 
attempted a systematic inquiry to discover 
the combination of variables that seemed 
potentially useful in predicting academic 
success in the first grade. Psycholinguistic 
abilities, visual-motor skills, and intelli- 
gence were the psychological factors chosen 
as predictive variables. Whereas Egeland 
sought to determine the relationships of 
these variables to academic performance in 
the areas of first-grade reading and arith- 
metic, the present authors attempted to 
establish if these same relationships now 
exist for the children as they progressed 
through the third, fourth, and fifth grades. 


Method
Subjects 


The pupils from Egeland’s sample as they com-
pleted third, fourth, and fifth grades in the Iowa
City public school district comprised the sample
of the present study. The sample sizes
at the third-, fourth-, and fifth-grade levels were
64, 67, and 57, respectively. The attrition rate of
subjects from the original first-grade sample of
125 is high. To a large extent, this is due to the
mobility of children in a university community.
The increase in the number of subjects from 64 in
the third to 67 in fourth grade was the result of
three members being absent at the time of the
third-grade test administration.
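The attrition described above can be made concrete from the reported sample sizes. The short Python sketch below simply expresses each follow-up sample as a percentage lost from the original 125 first graders; the variable names are illustrative only.

```python
original_n = 125                       # first-grade sample reported above
followup_n = {3: 64, 4: 67, 5: 57}     # grade -> pupils tested

# Percentage of the original sample not tested at each grade.
attrition_pct = {
    grade: round(100 * (original_n - n) / original_n, 1)
    for grade, n in followup_n.items()
}

print(attrition_pct)
```

On these figures roughly half of the original sample was unavailable at each grade (48.8%, 46.4%, and 54.4% for the third, fourth, and fifth grades, respectively).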


Tests 


The measuring tools used to assess intelligence,
visual-motor development, and psycholinguistic
abilities were the Wechsler Intelligence Scale for
Children (WISC), the Bender Visual Motor
Gestalt Test for Children, and the Illinois Test
of Psycholinguistic Abilities (ITPA), respectively.
The criteria for achievement were scores
on the Iowa Test of Basic Skills. The scores
obtained on the Vocabulary and Reading subtests
of the Iowa Test constituted the measure of read-
ing ability, while the score on the Arithmetic sub-
test represented the degree of mathematical pro-
ficiency.
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Data Analysis 

The correlations between the independent var- 
iables—Bender Gestalt Koppitz scores, ITPA, and 
WISC Full Scale IQ scores—and the criterion of 
achievement were ascertained, and the best com- 
bination of independent variables for predicting 
academic success was determined. The statisti-
cal procedure utilized for this interpretation was
the stepregn method (Draper & Smith, 1966). 

Six multiple-regression analyses were per-
formed for each of the three grades. The first three
regression equations for each grade determined
the relationships of the independent variables,
WISC Full Scale IQs, Bender Gestalt Koppitz
scores, and total language score from the ITPA,
with each of the three areas of achievement: vo-
cabulary, reading, and arithmetic concepts and
skills. The remaining three multiple-regression
analyses were concerned with the same areas of
achievement, with the difference being that ITPA
subtest scores were used; these test scores were
part of the independent variables. A 5% signifi-
cance level was used throughout the study.
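What it means for a variable to "add significantly" to a multiple correlation can be sketched for the simplest case of two predictors. The example below is not the stepregn program itself, whose details are not given here; it uses the standard two-predictor formula for the squared multiple correlation and the F ratio for the increment from adding one predictor. The two validity coefficients (.39 and .37) are the Grade 3 vocabulary correlations for the ITPA composite and the Bender Gestalt from Table 1; the predictor intercorrelation of .30 is a purely hypothetical value, since the report does not give it.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of a criterion with two predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def f_for_increment(r2_full: float, r2_reduced: float, n: int, k_full: int) -> float:
    """F ratio, with 1 and n - k_full - 1 df, for adding one predictor."""
    return (r2_full - r2_reduced) / ((1 - r2_full) / (n - k_full - 1))


n = 64            # third-grade sample size
r_itpa = 0.39     # ITPA composite with vocabulary (Table 1, Grade 3)
r_bender = 0.37   # Bender Gestalt with vocabulary (Table 1, Grade 3)
r_between = 0.30  # HYPOTHETICAL intercorrelation of the two predictors

# Step 1: enter the predictor with the larger correlation.
r2_step1 = r_itpa ** 2
# Step 2: add the second predictor and test the gain in R squared.
r2_step2 = r_squared_two_predictors(r_itpa, r_bender, r_between)
f_gain = f_for_increment(r2_step2, r2_step1, n, k_full=2)

print(round(r2_step1, 3), round(r2_step2, 3), round(f_gain, 2))
```

The resulting F would be compared with the tabled critical value for 1 and 61 degrees of freedom at the 5% level; a different assumed intercorrelation would change the outcome, which is why the second predictor may or may not enter the equation.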


Results and Discussion


Table 1 reveals that the Wechsler Full 
Scale IQ, Bender Gestalt Koppitz scores, and 
ITPA composite scores correlated signifi- 
cantly with all the achievement criteria for 
the three grades under investigation. The
results pertaining to the relationship of the
one remaining independent variable, ITPA
subtests, to achievement show that Visual-
Motor Association is the only subtest which
significantly correlated with all achievement
criteria across primary grades. The range of
significant correlations for the Visual-Motor
Association subtest, as presented in Table 2,
extends from a low of .26 to a high of .48.
Other subtests of the ITPA were signifi- 
cantly associated with achievement on a less 
global scale. An inspection of Table 2 shows 
the subtests at the Automatic-Sequential 
and Auditory Decoding level were found to 
be significantly related to achievement in the 
third and fourth grades, respectively, but 
not in the fifth. At the Association level all 
these subtests were significantly correlated 
with achievement in fourth and fifth grades. 

TABLE 1
CORRELATIONS OF THE INDEPENDENT VARIABLES,
WISC FULL SCALE IQ, BENDER GESTALT
KOPPITZ SCORES, AND TOTAL LANGUAGE
SCORES ON THE ITPA WITH ACHIEVEMENT
AS MEASURED BY THE IOWA TEST OF BASIC
SKILLS ACROSS PRIMARY GRADES

Variable                 Vocabulary   Reading   Arithmetic

WISC Full Scale IQ
  Grade 3                   .28         .26        .95
  Grade 4                   .48         .38        .91
  Grade 5                   .89         .38        .49
Bender Gestalt scores
  Grade 3                   .37         .32        .30
  Grade 4                   .45         .51        .89
  Grade 5                   .43         .46        .41
ITPA composite score
  Grade 3                   .39         .38        .49
  Grade 4                   .36         E          .30
  Grade 5                   .21         .29        +383

Note. All correlations are significant at the
.05 level. For Grade 3, n = 64; Grade 4, n = 67;
and for Grade 5, n = 57.


The Association subtests measure the
child’s ability to relate visual or auditory
symbols in a meaningful way while the
Automatic-Sequential subtests assess the
child’s facility to deal with nonmeaningful
uses of symbols, principally their long-term
retention and the short-term memory of
symbol sequences. The subtest Auditory
Decoding relates to the child's ability to 
comprehend the spoken word. It is suggested 
that success in achievement in the third,
fourth, and fifth grades is not a completely
static situation with respect to linguistic
skills. It is true that the ability to relate
meaningful visual symbols is an important
factor in achievement across primary grades.
However, it is apparent that each grade
requires its own unique combination of 
linguistic skills in order for an individual to 
be successful in achieving. 

A further study of Table 2 reveals that 
subtests of the ITPA, Motor Encoding and 
Visual Decoding, were found not to be 
significantly related to all achievement 
criteria in each of the grades under investigation.
With respect to the Motor Encoding
subtest there is the suggestion that the
ability to express one's ideas in the form of 
gesture is not, according to the results of the 
present study, related to academic achieve- 
ment in either the third, fourth, or fifth 
grades. The lack of significance is possibly 
due to the construction of the Iowa Test of 
Basic Skills. The responses required on the 
TABLE 2

CORRELATION OF THE ITPA SUBTEST SCORES WITH ACHIEVEMENT AS MEASURED BY THE IOWA TEST
OF BASIC SKILLS ACROSS PRIMARY GRADES

                              Vocabulary              Reading                 Arithmetic
ITPA subtests       Grade:    3       4       5       3       4       5       3       4       5

Auditory vocal automatic      .88*    .49*    .25*    .84*    .45*    .17     .84*    .32*    .16
Visual decoding               .02     .17     .01     -.01    .08     -.02    .09     .21*    .04
Motor encoding                .06     .04     .08     -.08    .02     -.07    .15     .01     -.08
Auditory vocal association    .26*    .48*    .24*    .25*    .97*    ,9*     .18     .26*    82"
Visual motor sequencing       .85*    .82*    .12     .46*    .35*    .84*    .45*    .33*    .40*
Vocal encoding                .17     .16     .14     .11     .28*    .15     .29*    .17     .08
Auditory vocal sequencing     .82*    .44*    .19     .42*    .46*    .27*    .39*    .37*    .99*
Visual motor association      .40*    .42*    .26*    .89*    .46*    .44*    .40*    .45*    .48*
Auditory decoding             .29*    .37*    .08     .25*    .27*    .05     .32*    .26*    .01


* Correlation is significant at the .05 level. 


Iowa Test measuring reading and arithmetic 
achievement involve a multiple-choice re- 
sponse that would not require the child to 
express overtly his ideas in terms of gestures. 
Perhaps if the criteria of academic success 
required the child to express himself freely 
and if the criteria were not quite as struc- 
tured as the Iowa Test of Basic Skills, the 
Motor Encoding subtest might have pre- 
dicted achievement to a significant degree 
across primary grades. 

The lack of a significant relationship 
between the Visual Decoding subtest and 
achievement across primary grades is indeed 
surprising and contrary to what would be 
expected. Visual Decoding assesses the 
ability of a child to comprehend pictures and 
written words which, superficially, appears 
to be a skill necessary for achievement on 
the reading and word knowledge tests. A 
possible explanation for the lack of a re- 
lationship between the Visual Decoding 
subtest and achievement across primary
grades may be the somewhat low reliability 
coefficient for the Visual Decoding subtest 
(r = .45) when given to children 6 years of 
age. 


In addition to trying to establish the 
relationship of each of the independent 
variables to achievement, an attempt was 
made to select the best combination of 
independent variables for predicting achieve- 
ment across primary grades using a multiple- 
regression analysis. Table 3 contains the 
final multiple correlations and significant 
beta weights of the third, fourth, and fifth 
grades for the three independent variables: 
WISC Full Scale IQ, Bender Gestalt 
Koppitz score, and ITPA composite score. 

In exploring the relationship between the 
independent and dependent variables in the 
third grade, it is clear from an examination 
of Table 3 that psycholinguistic abilities, as
measured by the composite score on the
ITPA, constitute the only independent variable
significantly contributing to the multiple
correlation computed for third-grade
achievement. The other two independent
variables, IQ and Koppitz scores on the
Bender Visual Motor Gestalt, do not add
appreciably to the multiple correlation.
When the subtest scores on the ITPA are
substituted for the composite score in the
multiple-correlation analysis, IQ and
Koppitz scores again fail to add significant
beta F values as shown from the data in 
Table 4. The subtest scores, however, do not 
reveal any consistent trend in the multiple- 
regression analyses involving the three 
criteria measures for third-grade achieve- 
ment. Table 4 reveals that the Visual-Motor 
Association and Auditory Vocal Automatic 
subtests are the principal factors contri- 
buting significant beta F values in the 
prediction of the vocabulary score on the 
Iowa Test. Auditory Vocal Automatic and
Visual-Motor Sequencing both significantly 
add to the multiple-regression analysis 
involving prediction of the reading test of 
the Iowa Test of Basic Skills, while Auditory 
Decoding and Visual Motor Sequencing are 
significant factors in the prediction of arithmetic.
The absence of measured intelligence
in each of the final multiple correlations 
computed is quite surprising. 

Prediction of fourth-grade achievement 
reveals an entirely different picture. From
an examination of Table 3, it is clear that
the visual-motor score is the only variable
which contributes significantly to the multiple
correlations except for the achievement
criterion vocabulary. Here intelligence and 
visual-motor functioning are found to have 


TABLE 3
FINAL MULTIPLE CORRELATIONS AND SIGNIFICANT
BETA WEIGHTS OF THE THIRD, FOURTH, AND
FIFTH GRADES FOR THE INDEPENDENT
VARIABLES: WISC FULL SCALE IQ, BENDER
GESTALT, AND ITPA COMPOSITE SCORE

Achievement criterion:
Final R     Grade   Significant beta weights

Vocabulary subtest
  .39         3     ITPA composite
  .54         4     IQ, Bender Gestalt
  .43         5     Bender Gestalt
Reading subtest
  .38         3     ITPA composite
  5           4     Bender Gestalt
  .46         5     Bender Gestalt
Arithmetic subtest
  fd          3     ITPA composite
  S           4     Bender Gestalt
  in          5     Bender Gestalt


TABLE 4
FINAL MULTIPLE CORRELATIONS AND SIGNIFICANT
BETA WEIGHTS OF THIRD, FOURTH, AND FIFTH
GRADES USING THE INDEPENDENT VARIABLES:
WISC FULL SCALE IQ, BENDER GESTALT,
AND ITPA SUBTESTS

Achievement criterion:
Final R     Grade   Significant beta weights

Vocabulary subtest
  .49         3     ITPA Subtests 4, 11
  .62         4     IQ, ITPA Subtests 4, 11
  .43         5     Bender Gestalt
Reading subtest
  .55         3     ITPA Subtests 4, 8
  .64         4     Bender Gestalt, ITPA Subtests 7, 8
  .55         5     Bender Gestalt, ITPA Subtest 7
Arithmetic subtest
  .55         3     ITPA Subtests 8, 12
  .50         4     ITPA Subtests 10, 11
  .58         5     Bender Gestalt, ITPA Subtests 8, 11


Note.—Subtest numbers: 4 = Auditory-vocal 
automatic, 7 = Auditory-vocal association, 8 =
Visual-motor sequencing, 10 = Auditory-vocal
sequencing, 11 = Visual motor association, and
12 = Auditory decoding.


significant beta F values. Once again, the 
minor role of measured intelligence in the 
prediction of achievement is unexpected.
The emergence of the Bender Visual Motor
Gestalt Test for Children as an important 
variable related to achievement receives 
some support from the literature. Koppitz 
(1964) points out the usefulness of the 
Bender Visual Motor Gestalt Test in 
screening and predicting reading and arith- 
metic achievement in lower elementary 
grades. She cites correlations between
that test and achievement as typically in
the range from .50 to .70.

Replacement of the ITPA composite 
score in the fourth grade with ITPA subtest
values in the multiple-regression analysis
changes the prediction paradigm as Table 4 
reveals. Fourth-grade achievement as as- 
sessed by the Vocabulary Test of the Iowa 
Test is significantly associated with meas- 
ured intelligence and two subtests of the 
ITPA, Auditory Vocal Automatic and 
Visual-Motor Association. Visual-motor
functioning and the Auditory Vocal Asso- 
ciation and Visual-Motor Sequencing Sub- 
tests of the ITPA are found to contribute 
significant beta F values involving prediction 
of the Reading test of the Iowa Test of 
Basic Skills. For the relationship between 
the dependent variable, arithmetic achieve- 
ment, and the independent factors, two 
subtests of the ITPA (Auditory Vocal 
Sequencing and Visual-Motor Association) 
emerge as significant factors associated with 
fourth-grade arithmetic achievement. 

In the fifth grade, the relationship between
the independent and dependent variables under
investigation changes from that found in
either the third or the fourth grade. From
Table 3 it is apparent that the visual-motor 
index is the one independent variable con- 
tributing in a significant way to the mul- 
tiple correlations for all three achievement 
areas explored. However, substitution of the 
subtests of the ITPA for its composite score 
in the multiple-regression analysis results in 
a different paradigm for prediction as a 
review of Table 4 will show. For example, 
visual-motor functioning is the only in- 
dependent variable contributing a significant 
beta weight in the multiple-regression 
analysis involving the prediction of fifth 
grade achievement as measured by the 
Vocabulary test of the Iowa Test of Basic 
Skills. The Reading test also is associated 
significantly with visual-motor functioning; 
however, the ITPA Auditory Vocal Asso- 
ciation Subtest also shares in the prediction 
of this dependent variable as Table 4 reveals. 
For fifth-grade arithmetic achievement the 
picture for prediction changes altogether. 
Here the ITPA subtests, Visual Motor 
Sequencing and Visual-Motor Association 
emerge as the significant variables in the 
multiple-regression analysis.

It is obvious from the material considered 
and the data presented in this investigation 
that the forecasting of academic success in 
the elementary grades involves a variety of 
factors. Apparently, success in each grade
calls not for stability of skills from one
grade to another but for flexibility on the
part of the subject. The cognitive skills
that are important in the first grade
are not necessarily crucial for adequate
functioning in later grades. What accounts
for the interchanges of variables found to be 
important for success across primary grades 
is open to conjecture. The sex of the sub- 
jects, the developmental nature of the 
subjects under investigation, the uniqueness 
of the curriculum of each grade, the possible 
bias inherent in the makeup of the sample, 
and the measuring instrument itself (the 
Iowa Test) are all factors, in addition to
those investigated, that could individually 
or collectively determine the degree of 
relationships that were found in this study. 
LETTER SCANNING RATE FOR GOOD AND POOR READERS
IN GRADES TWO AND SIX¹


LEONARD KATZ² AND DAVID A. WICKLUND


University of Connecticut 


The ability to scan visually a row of letters for the presence or 
absence of a predetermined key letter was tested for good and poor 
readers in Grades 2 and 6. Target rows consisted of either one, two, 
or four letters. Latency of response was greater for second graders 
than for sixth graders. In addition, scan rate was slower for second 
graders. No differences due to reader ability were found. The results 
agreed with previous findings that good and poor readers did not dif- 
fer in mean scan rate when words instead of letters had been used as 
stimuli. However, the previous suggestion that differences between 
good and poor readers existed in the efficiency of response selection 
was, in the light of the present data, found to be questionable. 


Recently, the present authors (1971b) 
studied the word scanning rate of fifth 
graders classified as good or as poor readers. 
On each trial, subjects were required to scan
rapidly a target row of words for the presence
or absence of a predetermined key word; the
key word changed from trial to trial. Target
rows were either two or three words in
length. Poor readers were about 250 milliseconds
slower overall than good readers.
However, the increase in reaction time for
three word targets was the same (about 100
msec.) for both good and poor readers. From
these results, we inferred that the difference
between the two kinds of readers was not
in the ability to scan, encode, and match
words; the differences must occur elsewhere:
in the speed of orientation to the target, in
response selection (deciding on what response
to make after a match or no-match
is found), or in the motor portion of the
response.

¹ This research was supported by National
Institute of Child Health and Human Development
Grant HD-03932-02 to the authors. Susan
... Manchester, Connecticut, for his kind
...ation.

² Requests for reprints should be sent to
... Department of Psychology, University
of Connecticut, Storrs, Connecticut 06268.


Another experiment (Katz & Wicklund, 
1971a) demonstrated that no difference 
existed between good and poor readers in a 
simple reaction time situation with a con- 
stant foreperiod. These data suggested the 
differences between good and poor readers 
did not lie in orientation to the target or in 
the motor portion of the response. This left 
the mechanism of response selection as being 
responsible for the difference. 

The present experiment was designed to 
explore the generality of the previous findings
on scanning. The age range was increased
to include second- and sixth-grade
subjects. The range of target lengths was 
increased to include one, two, and four 
stimuli. The mode of responding was 
changed from oral to manual, and last, the 
stimuli used in the present study were letters 
instead of words. Letters were chosen be- 
cause they are more strongly overlearned 
for both second- and sixth-grade subjects 
than words. By eliminating possible idio- 
syncratic and group differences between 
subjects due to the encoding of words, letters 
make for a more precise test of group differences
in high-speed visual scanning.

On the basis of our previous experiment on 
word scanning rate, it was expected that 
scanning rate should be equivalent for good 
and poor subjects within each grade. How- 
ever, poor subjects should be slower in 
overall latency. Results of both of the previous
studies (Katz & Wicklund, 1971a, 1971b)
suggested that this difference would be 
caused by a poorer response selection ability. 
If anything, the response selection problem 
should be more difficult in the present study. 
The oral response “yes” or “no” used in the 
word scanning rate experiment should follow 
naturally from the outcome of the process of 
matching key word to target words. On the 
other hand, in the present study, subjects 
must learn a moderately difficult corre- 
spondence between the outcome of the 
matching process and the appropriate 
manual button press for “yes” or “no.” 

With regard to grade differences and 
based on the simple reaction time experi- 
ment (Katz & Wicklund, 1971a), it was 
likely that second graders would be slower 
overall than sixth graders. In addition, if 
second graders scanned at the same rate as
sixth graders, the relation of the second to 
the sixth grade would be analogous to the 
poor reader-good reader relation found at a 
single grade in Katz and Wicklund (1971b). 
This would suggest that poor readers in a 
single grade are simply at an earlier develop- 
mental level than the good readers; develop- 
ment here might consist of increased effi- 
ciency in response selection. It should be 
recalled that development of the motor
portion of the response can be ruled out as a
candidate for good reader-poor reader
differences, because of the results of Katz
& Wicklund (1971a) that showed no good
reader-poor reader difference in simple 
reaction time. 


METHOD


Stimuli 


Stimuli were constructed from the set of 10 
lowercase letters: b, d, e, f, g, h, l, m, n, and t. For
each trial, two slides were made; the first slide 
contained a single letter (the key letter) and the 
second slide contained a row of either one, two, or 
four letters (the target). In each of three sets of 15 
trials each, the first 5 trials contained targets of 
one letter each, the next 5 trials contained targets 
of two letters each, and the last 5, targets of four 
letters each. The first, sixth, and eleventh trials
were dummy trials (these data were not analyzed) 
and were included to give practice on the particular
type of target used for the four subsequent
trials.




Excluding the dummy trials, each key letter 
occurred an equal number of times in each set and 
an equal number of times, across sets, for each 
target length. Within each set of 12 actual trials, 
2 trials at each length were positive trials (i.e., the 
target slide contained the key letter) and 2 were 
negative trials. The order of positive and negative 
trials was random within the constraints men- 
tioned. The position of the key letter on positive 
target slides (e.g., between the first and second 
position on a two-letter target) was balanced for 
two- and four-letter targets. 


Subjects 

Subjects in each grade were tentatively divided 
into groups of good and poor readers based on the 
reading test scores which were available, Ginn 
scores for the second grade and Iowa scores for the 
sixth grade. Following each experimental session, 
the Wide Range Achievement Test reading sec- 
tion was administered to the subject and a final 
classification was made on the basis of the 
Wide Range scores. A few subjects, whose Iowa 
or Ginn percentile scores were markedly differ- 
ent from their Wide Range scores were excluded 
from the data analysis and were replaced with 
new subjects. Fifteen good readers (G6) and 15 
poor readers (P6) in Grade 6 and 12 good (G2) and 
12 poor readers (P2) in Grade 2 were analyzed. 
The final range of Wide Range Achievement Test 
percentiles, together with medians (in parentheses)
was as follows: G6, 61-91 (81); P6, 2-47
(18); G2, 55-97 (75); P2, 8-53 (25). 


Apparatus 


A Kodak Carousel slide projector was used in 
conjunction with Hunter timers, a Hunter Model 
1520 Millisecond clock, a two-key telegraph board, 
and response indicator lights. 


Procedure 


At the beginning of an experimental session, the 
experimenter ascertained which hand was the subject's
dominant hand and placed a card reading “Yes”
next to the telegraph key under the dominant 
hand and a card reading “No” next to the other 
key. The subject was instructed to say the letter 
aloud on the first slide of each trial and, when the 
second slide appeared, to make the appropriate 
manual response in answer to the implicit question, 
“Is the key letter in the second slide?” The subject
then received five practice trials and was cautioned
to respond faster whenever his latency rose above
1.7 seconds.

Then, the subject went through three sets
(blocks) of 15 trials each with a pause of about 1
minute between sets, while the experimenter
changed the slide tray. The order of presentation
of sets was counterbalanced in a Latin square, so
that each stimulus set occurred equally often in
each block. Within each grade, an equal number of
subjects went through each set order.
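
The counterbalancing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set labels A, B, and C, the cyclic construction of the square, and the round-robin assignment of subjects are hypothetical, since the text says only that a Latin square was used and that equal numbers of subjects received each order.

```python
def latin_square_orders(items):
    """Build a cyclic Latin square: row r is the item list rotated by r places."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(n_subjects, orders):
    """Cycle subjects through the orders; the split is exactly equal
    when n_subjects is a multiple of len(orders)."""
    return [orders[i % len(orders)] for i in range(n_subjects)]


# Hypothetical labels for the three stimulus sets.
stimulus_sets = ["A", "B", "C"]
orders = latin_square_orders(stimulus_sets)

for order in orders:
    print(" -> ".join(order))

# 30 subjects were analyzed in Grade 6 and 24 in Grade 2.
grade6_orders = assign_orders(30, orders)
grade2_orders = assign_orders(24, orders)
```

Each column of the square is a block, so every set appears once in every block position; with 30 and 24 subjects the three orders divide evenly within each grade.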
At the beginning of each trial, the experimenter
initiated the first slide which was presented for 3 
seconds and was followed by the target slide after 
a 940-millisecond slide-change interval. The sec- 
ond slide remained on for 3 seconds. 


RESULTS AND DISCUSSION 


Errors 


Errors were few and unsystematic. The 
mean error rates per subject for the 36 trials 
were as follows: good sixth-grade readers, 
2.7; poor sixth-grade readers, 2.5; good 
second-grade readers, 2.2; poor second-grade 
readers, 2.0. 


Analyses of Variance 


For each subject, an average latency was 
computed for each combination of response 
type (yes or no), scan length (1, 2, or 4), and 
block (1, 2, 3). Each average was based on 
two latencies unless the subject had made 
an error, in which case the remaining latency 
was used. An analysis of variance was per- 
formed on these correct response data within 
grade (2 and 6) and reader ability (good 
and poor). 
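
The cell averaging rule above can be written out as a short Python sketch. The function name, the data layout, and the example latencies are hypothetical; the text does not say how a cell with two errors was handled, so returning None in that case is an assumption.

```python
def cell_latency(trials):
    """Average the correct-response latencies for one cell.

    trials: list of (latency_ms, is_correct) pairs, normally two per cell
    (one response type x one scan length x one block).
    """
    correct = [latency for latency, is_correct in trials if is_correct]
    if not correct:
        # Assumption: a cell with no correct response has no usable latency.
        return None
    return sum(correct) / len(correct)


# Hypothetical latencies in milliseconds.
both_correct = cell_latency([(700, True), (740, True)])   # mean of the two
one_error = cell_latency([(700, True), (900, False)])     # remaining latency
```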

There were strongly significant main 
effects (p < .001) for grade (F = 26.8, df = 
1/50), response type (F = 35.6, df = 1/50), 
and scan length (F = 62.0, df = 2/100). 
Latencies for Grade 2 were slower than those 
for Grade 6, negative responses were gen- 
erally slower than positive responses and 
latency increased (with an exception noted 
below) as scan length increased. Scan rates 
(i.e., latency as a function of scan length)
appeared curvilinear, decelerating with
increasing scan length. Neither reader
ability nor any of its interactions were
significant and, in fact, the reader ability
mean square was small.

The effect of blocks was significant (F = 
8.23, df = 2/100, p < .01), as were the 
Blocks X Response Type X Length interaction
(F = 2.82, df = 4/200, p < .05), and
Blocks X Response Type X Length X
Grade effect (F = 3.58, df = 4/200, p <
.01). These effects appeared to be due to
practice. Grade 6 decreased their overall
latency from Block 1 to Block 3 but retained
the same scan rate; Grade 2 decreased their
overall latency and, in addition, decreased
FIG. 1. Positive and negative response latencies
for Grades 2 and 6 as a function of target size. 


their scan rate. Moreover, the negative 
response latencies for Grade 2 decreased 
more over blocks than the positive response 
latencies. In Blocks 1 and 2, negative re- 
sponse latencies for Grade 2 subjects were 
faster when scan length was 2 than when 
length equaled 1. This result did not occur 
in Block 3. 

In order to inspect data without practice 
effects, the data from Block 3 alone were 
subjected to an analysis of variance. The 
effect of grade was significant (F = 60.4, 
df = 1/50, p < .001). Neither reader ability 
nor any of its interactions was significant. 
Response type was significant (F = 14.1, 
df = 1/50, p < .001) as was Response Type 
X Grade (F = 4.9, df = 1/50, p < .05). 
Inspection of Figure 1 suggests that negative 
responses were slower than positive responses 
for Grade 2 but not for Grade 6. Scan length 
was significant (F = 26.5, df = 2/100, p < 
.001), as was Scan Length X Grade (F = 
4.98, df = 2/100, p < .01). Figure 1 suggests 
that the scan rate for Grade 2 was generally 
slower than for Grade 6. 


Type of Scanning 

Although no reader ability effects were 
observed, differences due to grade are in- 
formative with regard to ability. The data 
for Grade 6 suggest that subjects utilized a 
high-speed scan based on visual information. 
By inspection of Figure 1, the data are 
reasonably linear and there appear to be no
important differences between positive and 
negative latencies; none were statistically 
significant. This suggests that scanning was
exhaustive (as opposed to self-terminating) 
for these subjects; that is, subjects scanned 
the entire target row of letters, even on 
positive trials where scanning continued 
after a matching letter had been scanned. 
Similar results were obtained by Atkinson, 
Holmgren, and Juola (1969) with adult sub- 
jects. The scan rate for Grade 6 averaged 
over response type and based on a least- 
squares linear fit is 18.2 letters/second 
(slope equals 55 milliseconds). This rate is 
faster than inner or covert speech (Stern- 
berg, 1969) and faster than the preferred rate 
for compressed speech found for adults (Orr, 
Friedman, & Williams, 1965); it suggests 
that subjects did not name the target stimuli 
as they scanned them but matched on the 
physical visual characteristics of the stimuli. 
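
The conversion from a fitted slope to a scan rate can be illustrated with a small Python sketch. The three latencies below are hypothetical values chosen to lie on a line with a 55-millisecond slope through the 715-millisecond two-letter latency reported later in the text; they are not the published data, and the helper names are illustrative.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def letters_per_second(slope_ms_per_letter):
    """Convert a slope in milliseconds/letter to a rate in letters/second."""
    return 1000.0 / slope_ms_per_letter


scan_lengths = [1, 2, 4]
latencies_ms = [660.0, 715.0, 825.0]  # hypothetical, slope of 55 ms/letter

slope, intercept = least_squares_line(scan_lengths, latencies_ms)
rate = letters_per_second(slope)
print(round(slope, 1), round(rate, 1))  # 55.0 18.2
```

The slope is the time added per additional letter, so its reciprocal gives the number of letters scanned per unit time.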

For Grade 2, the results are equivocal with 
regard to exhaustive versus self-terminating 
scanning but, nevertheless, suggest high- 
speed scanning. Inspection of Figure 1 and 
the analysis of variance indicates that nega- 
tive responses are slower than positive re- 
sponses. Exhaustive scanning may obtain 
even if negative responses are slower as long 
as the positive and negative latency curves 
remain parallel as scan length increases. If a 
subject does not scan exhaustively, but 
instead terminates when he reaches the cor- 
rect letter in a target, on the average, he will 
scan only half as long on positive trials, since
the positive letter appeared equally often in 
each serial position and had a mean position 
equal to half the scan length. In the latter 
case, the slope for positive responses should 
be half that for negative responses. The
average slopes for positive responses and 
negative responses, respectively, are 81 and 
147 milliseconds, a ratio of .55. These differ- 
ences in slope, however, are not significant; 
the F ratio for the Grade X Response Type 
X Scan Length was F = 1.91, df = 2/100, 
p < .20. Thus, there is no clear evidence in 
support of either scanning strategy. 

If self-terminating scanning is the case, 
the scan rate is the average of the negative
response slope and twice the positive response
slope, that is, 155 milliseconds/letter
or 6.5 letters/second. On the other hand, if
exhaustive scanning obtains, the scan rate is
simply the average of the negative and
positive response function slopes, that is, 114 
milliseconds/letter or 8.8 letters/second. The 
slowest scanning rate estimated can be com- 
puted from Target Lengths 1 and 2 alone, 
under the assumption of a self-terminating 
scanning strategy. This estimate produced a 
rate of 3.4 letters/second. If subjects utilized 
inner speech to scan the letters, they would 
be at about the rate (words/second) found
by Orr et al. (1965) for college-age subjects. 
(See also Sternberg, 1969.) The four-letter 
targets would, of course, have to be processed
at a rate much faster than this. Thus, the 
evidence suggests that neither Grade 2 sub- 
jects nor Grade 6 subjects used inner speech 
in the comparison process. 
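
The two Grade 2 estimates can be reproduced from the reported slopes of 81 and 147 milliseconds. The Python sketch below follows the arithmetic stated in the text; the function names are illustrative. Doubling the positive slope reflects the assumption that a self-terminating scan covers, on average, half the target on positive trials.

```python
def exhaustive_ms_per_letter(positive_slope, negative_slope):
    """Exhaustive scan: both slopes estimate the same per-letter time."""
    return (positive_slope + negative_slope) / 2


def self_terminating_ms_per_letter(positive_slope, negative_slope):
    """Self-terminating scan: the positive slope covers about half the
    letters, so it is doubled before averaging with the negative slope."""
    return (2 * positive_slope + negative_slope) / 2


positive_slope, negative_slope = 81, 147  # Grade 2 slopes in ms, from the text

exhaustive = exhaustive_ms_per_letter(positive_slope, negative_slope)
self_terminating = self_terminating_ms_per_letter(positive_slope, negative_slope)

print(exhaustive, round(1000 / exhaustive, 1))              # 114.0 8.8
print(self_terminating, round(1000 / self_terminating, 1))  # 154.5 6.5
print(round(positive_slope / negative_slope, 2))            # 0.55
```

The self-terminating figure of 154.5 rounds to the 155 milliseconds/letter given in the text.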


Reader Ability 


Most importantly, no differences in scan- 
ning letters were observed between good and 
poor readers, a result contrary to that ob- 
tained in the earlier word scanning experi- 
ment with fifth graders (Katz & Wicklund, 
1971b). It can not be argued that reader 
ability differences were obscured in the
present experiment because of a "ceiling 
effect” of slow responding: latencies appear 
to be at least as fast as those in Katz & 
Wicklund (1971b). For example, the scan 
rates for words found in the earlier experiment
were about 100 milliseconds/word, a
figure comparable to the sixth-grade rates
shown in Figure 1. Moreover, the latency for
two words in the earlier study was 1,240
milliseconds, about half a second slower than
the latency of 715 milliseconds on two letters
found in the present study for sixth graders.
Even allowing for the difference in the response mode
between the two experiments, it is unlikely 
that the vocal motor response component 
alone should be as much as half a second 
slower than the motor component for manual 
responding. Therefore, it is likely that per- 
formance for sixth graders, at least, was not 
inferior to the performance of fifth graders 
in the earlier study; the lack of a reader 
ability effect can not be easily attributed to a
ceiling effect. In addition, the mean square
for the reader ability term in the
analysis of variance was small (F = .47, df
= 1/50), while effects of the other variables
were highly significant and appear, on inspection
of Figure 1, to be strong.

If one is able to consider the Grade 6 
subjects in the present study and Grade 5 
subjects in the previous Katz and Wicklund 
(1971b) study to be roughly equivalent for the
purpose of comparing the two experiments, 
there are two important differences between 
the two experiments that should be pointed 
out. They are the mode of responding and 
the nature of the stimuli. As we have 
mentioned above, the use of manual respond- 
ing was designed to emphasize hypothesized 
differences between good and poor readers in
response selection ability. Because no such
differences were found, we conclude that the
response selection hypothesis is not correct. 

There are at least two ways in which the 
stimulus differences between the present 
experiment and the Katz and Wicklund (1971b)
study can account for the different results. 
First, even a scan length of two words contains 
more letters than a scan of four letters; it 
may be that a length of four letters was 
within the span of apprehension of all sub- 
jects, good and poor readers alike, within 
each grade, and differences will not be 
found until we probe for different appre- 
hension spans between good and poor 
readers.

A second explanation follows from the
difference in the familiarity to
subjects of words and letters. The alphabet
is a very well learned set for both good and 
poor readers, but words are less familiar and 
a given word may be differentially familiar 
to good and poor readers. If at some point in 
responding to the target, a subject must
retrieve the name of the key word, and if
good readers retrieve faster than poor readers
(and older subjects faster than younger
subjects), differences in overall latency would
result. Because in the Katz and Wicklund
(1971b) study the intercepts of the latency
functions, but not the slopes, differed between
good and poor readers, the memory
retrieval could not have occurred during 
scanning. The most likely point for retrieval 
would be just prior to the beginning of the 
scanning and comparison process. 

In an experiment on memory scanning, 
Katz and Wicklund (1969) presented fifth- 
grade subjects, on each trial, with a memory 
set of two, three, or five words, followed by a 
single target word. Although the usual result 
of a linear increase in reaction time with 
increases in the memory set did not occur, a 
significant overall difference of 250 milli- 
seconds between good and poor readers was 
obtained. The faster mean reaction time for 
good readers is of the same order of magni- 
tude as the difference found in the experi- 
ment on visual scanning for words (Katz & 
Wicklund, 1971b), and offers additional evi- 
dence in support of a memory retrieval 
difference hypothesis. Future research must 
probe for memory retrieval differences be- 
tween good and poor readers; our research 
suggests that there are no reader ability 
differences in scan rate. 
EFFECT OF TELEVISED SIMULATED INSTRUCTION
ON SUBSEQUENT TEACHING 


LARRY C. JENSEN AND JON I. YOUNG
Brigham Young University 


This study was conducted to determine whether simulated instruc- 
tion improved subsequent teaching. Thirty-seven subjects were se- 
lected from a teacher-training program. Half of the subjects were 
randomly assigned to simulation; the other half were controls. Following
the simulation, all subjects engaged in 8 weeks of student teaching.
Performance was evaluated on three separate occasions during the 
student teaching. A Teacher Performance Evaluation Scale was factor 
analyzed and six specific performance factors were identified: (a) per- 
sonality traits, (b) warmth of teacher behavior, (c) general class- 
room atmosphere, (d) lesson usefulness, (e) teacher interest in pupils, 
and (f) teacher interest in student achievement. Subjects receiving
microteaching received higher ratings on the first five factors.


Simulation can be defined as an instruc- 
tional problem-solving activity which closely 
resembles a situation in life. Although there 
is no specific way to employ simulation in 
learning, it generally involves three basic 
steps: (1) action on the part of the learner, 
(2) feedback on success or failure, and (3) a 
concise summary of response and appropri- 
ate solution to the problem (Edinger, 1968). 
With the invention of videotape recording 
devices, simulation has acquired a new 
popularity in psychology and education. 
Videotape recording uses a closed-circuit 
television camera and television monitor to 
record activity of a person or group of per- 
sons involved in a learning situation. The 
equipment is portable enough to be used al-
most anywhere, and therefore therapists,
[illegible] are able to examine
their own behavior during training. Un-
fortunately, most of the literature is incon-
clusive or contradictory and reports good
feelings rather than concrete conclusions 
(Bjerstedt, 1968; Borg, 1969; Foster, 1967; 
Gibson, 1968; Kallenback & Gall, 1969; 
Lockhart, 1968; Lundy & Hale, 1967). It 
appears that microteaching as a simulation 
experience in teaching has not been ade- 
quately tested with systematic research. 
The variables of models, supervisors, rein-
forcement, feedback, and focus of problems
have not been appropriately included in
either discussion or research (Politzer,
1969). In addition, it has not been deter-
mined exactly what aspect of teaching is in-
fluenced or if a permanent change occurs.

Lundy and Hale (1967) reported that 
microteaching prepares a trainee to teach 
because the conditions of the two experiences 
are closely associated. They conclude that 
since anxiety is present in both situations, 
the use of microteaching will decrease 
anxious responding and this makes student 
teaching a more meaningful experience. 
Travers, Rabinowitz, and Nemovicher 
(1952) alluded to this same idea. They re-
ported that because anxiety hinders learning
it handicaps student teachers by causing
them to be discouraged and disorganized,
thus leading to poorer performance. They
concluded that student teaching is the type
of situation in which anxiety develops and
it would be helpful, especially at first, if the
situation could be made more predictable to
reduce the anxiety.

It was hypothesized that since micro- 
teaching is a very similar experience to 
actual teaching, it should improve subse- 
quent teaching, especially at the beginning 
of the experience, because it reduces anxiety 
and increases the predictability of the sub- 
sequent teaching. The student should be 
more confident and less anxious about his
ability to teach after microteaching. In-
creased self-confidence should logically be
visible in such specific behaviors as poise, 
speech, and other observable indicators of 
nervousness. Since those students having 
simulation experience would theoretically re- 
quire less time for adjustment to subsequent 
teaching, their experience should be more 
profitable, and they should increase in 
proficiency faster than those not having had 
microteaching. 

While previous investigations about the 
effects of microteaching have only rarely 
used control groups, none have examined 
separately the specific areas of the teacher's
behavior. In addition, previous experiments 
have not made repeated observations to 
determine the longitudinal effect of micro- 
teaching. This experiment was designed to 
help meet the need for more controlled ex- 
periments on the effects of microteaching 
and to investigate different areas of teacher 
behavior over more than one time period. 


METHOD 


Subjects 


Subjects were enrolled in Brigham Young Uni-
versity’s Experimental Teacher Education Pro-
gram, called I-STEP. All subjects were complet-
ing their course requirements during the semester.

Program requirement, which influenced the
organization of this experiment, grouped the sub-
jects into 26 teams of two or three members with
two subjects unassigned. This was done on the
basis of subject-matter expertise and student-
teaching assignment. Only those teams that would
be teaching cognitively oriented subjects, such as
social science, math, English, etc., were used; for
example, students in physical education, music, or
languages were excluded. This procedure elimi-
nated all but 13 teams and the two individuals.

By use of a table of random numbers, six teams
and one individual were assigned to the experi-
mental group for a total of 19 subjects. Seven
teams and one individual were assigned to the
control group for a total of 19 subjects. The only
restriction placed on the assignment was that an
equal number of subjects in each group would be
teaching similar subject matter.

Subjects were dropped from the study for
one of the following reasons: (a) withdrawal from
school; (b) lack of available data due to sickness;
or (c) bias by the evaluator. Bias by two raters
was assumed because they had been involved in
the instruction of some subjects. Two males and
[number illegible] females were dropped from the control
group, and data were generated from the mean
scores of the remaining subjects. One group had 
five persons teaching on the junior high level as 
compared to six persons in the other group. All 
others taught senior high school pupils. Mean 
grade point averages were 3.11 and 3.09 in the two 
groups. The experimental group had 6 females 
and 11 males, and the control group had 4 females 
and 8 males. 


Apparatus and Materials 


The equipment consisted of a sound-deadened 
studio equipped with a desk, student chairs, a 
blackboard, and recording material. A video 
camera situated at the back of the room was posi- 
tioned to view both the subjects and the pupils 
during the simulation. A video recorder and tele- 
vision playback unit was located at one side of 
the studio. 

Teaching performance was measured with the 
Teacher Performance Evaluation Scale (Sinha, 
1962). The scale is used to measure specific be- 
haviors of persons engaged in teaching, and it is 
detailed enough to evaluate specific actions. The 
original scale had each item rated by a panel of 
professional educators from various parts of the 
United States. The scale consists of 42 areas, each 
measuring a specific behavior or attitude necessary 
for a teacher to be “successful.” Each item is rated
on an 8-point scale ranging from zero to seven. 
The higher the rating obtained, the more appro- 
priate the behavior is considered to be. Because of 
concern by the experimenters about the homo- 
geneity of the items within each of the areas, the 
items were factor analyzed. The ratings obtained 
on each item for all the subjects participating
in this experiment were factor analyzed for each
of the three observations. The results of the factor
analysis showed all 42 items could be grouped into
six categories of behavior having an interitem cor-
relation of .60 or better. The categories were la-
beled: (1) personality traits, (2) warmth of teacher
behavior, (3) general classroom atmosphere, (4)
lesson usefulness, (5) teacher interest in pupils,
and (6) teacher interest in student achievement.
Categories 1, 2, and 3 were the same as Sinha’s 
scale. 

The following is an example of sample items
belonging to factors of teacher personality traits,
teacher warmth, and teacher interest in pupils.
All are rated on the 8-point scale from zero to
seven.

1. Personality traits
Poise, general appearance in the class (smil-
ing, grouchy, indifferent, tense, etc.)

2. Warmth of teacher behavior
Attitude of teacher to pupils (friendly, sym-
pathetic, helpful, courteous.)

3. Teacher interest in pupils
Flexible plans which allow the teacher to
stop to discuss topics arising out of the needs of
the students.

4. General classroom atmosphere
Students involved in learning activities.

5. Lesson usefulness
Plan of lesson adapted to the needs of the
pupils.

6. Teacher interest in student achievement
Lessons presented in a way that pupils are
motivated to learn.


Design and Procedure 

Within the limits already specified, the subjects 
were randomly assigned to a control group or to 
an experimental group. The experimental group 
consisted of those subjects who received the simu- 
lation activity. The control group subjects re- 
ceived all phases of the teacher training program, 
excluding only the simulation activity. The train- 
ing program was individualized to the extent that 
each subject had certain assignments to accom- 
plish, but each subject worked under his own 
schedule. Microteaching was merely another as-
signment for the experimental group and was
completed when the subjects desired. 

During their participation in the I-STEP pro- 
gram, experimental-group students participated 
in at least three of the simulation activities called 
microteaching. The microteaching model used at 
Brigham Young University involves four steps: 
(a) the subject studies a specific teaching skill; 
(b) he applies the skill to a 7-minute lesson taught 
to four or five peers; (c) the lesson is videotaped 
and then viewed by the student; and (d) the lesson
is evaluated and critiqued. 

If necessary, the student reteaches the lesson.
Peers used as “pupils” were
students of I-STEP. They were used because they


the peers and supervisor submitted a written 
evaluation to the subject. The written evaluations 
expressed praise and suggestions for improvement. 
The subject collected all evaluations and sub- 
mitted a summary of his lesson to the supervisor 


jects’ supervisors observed a lesson and completed
the Teacher Performance Evaluation Scale for
that performance. Individual staff members who
were unaware of the microteaching phase of the
experiment were the raters. The raters participat-
ing in the evaluation phase were experienced in
evaluating student teachers, and the same raters
evaluated students in both treatment conditions.

Fig. 1. Mean ratings for personality traits.
(Asterisks indicate significant difference between
groups (p < .05) using the Tukey [a] test.)

analysis of variance. The first factor (A) was
the two groups, (B) was subjects, and (C) 
was the three observations or time. The 
data obtained with the Teacher Performance 
Evaluation Scale were scored into the six
categories or subscales resulting from the
factor analysis and the statistical procedures 
were computed on each of these categories. 

There was at least one significant F ratio 
for each of the factors. The means for each 
group and evaluation period can be ob- 
served in Figures 1 through 6. Only the in- 
teraction F ratio will be presented in the text 
for those factors having both significant 
main and interaction effects. 

On Factor 1, personality traits, there was
a significant interaction between groups
and the time variable (F = 4.11, df = 2/48,
p < .05). The mean scores for each group
are presented in Figure 1. A Tukey (a) com-
parison was used for single comparisons,
and the asterisk indicates that there was a
significant difference between the control
and experimental groups for that time pe-
riod. In Figure 1, the microteaching group
received a superior rating at each observa-
tion.

A measure of teacher warmth and con-
sideration for students was Factor 2. Al-
though both groups improved over time
(p < .001), the experimental group was
rated significantly better (p < .001) and
there was a significant interaction effect
(F = 3.97, df = 2/48, p < .05). The means
for this measure are presented in Figure 2.
The microteaching group improved more
rapidly.

The third factor, that of general classroom 
atmosphere, also had a significant difference 
between groups (p < .05) and across time 
(p < .01). However, the interaction effect
was again significant (F = 3.75, df = 2/48,
p < .05). The experimental group improved
more for the second and third observations,
as illustrated in Figure 3.

Data collected on the subjects’ ability to
present meaningful lessons (Factor 4) show
an improvement over time (p < .05) and a
significant group effect (F = 31.17, df =
1/48, p < .001). The interaction was not
significant. Means plotted in Figure 4 for
this measure are interpreted as showing
that the experimental group had constant
improvement but the control group did not.

Fig. 3. Mean ratings for general classroom
atmosphere. (Asterisks indicate significant differ-
ence between groups (p < .05) using the Tukey
[a] test.)

For teacher interest directed toward
students (Factor 5), there was a significant
Time × Treatment interaction (F = 9.70,
df = 2/48, p < .001). There was a steady
decrease in performance by the control
group while the experimental group im-
proved (see Figure 5).

Fig. 4. Mean ratings for lesson usefulness. (As-
terisks indicate significant difference between
groups (p < .05) using the Tukey [a] test.)

Mean scores for teacher interest in student 
achievement are displayed in Figure 6. 
There was only significant improvement 


tained even though the superiority was 
sometimes not evident until the third ob-
servation after approximately 6 weeks of


The failure to find the microteaching
group superior on the initial rating for the
subscales indicates that the effects of micro-
teaching are not temporary but may even
increase with a lapse of time. Accordingly,
it is suggested that the benefits of micro-
teaching are not due to a temporary reduc-
tion of anxiety or some other personal at-
tribute but may indicate that subjects
learned a basic problem-solving attitude
during microteaching that is progressively
reflected in teaching performance. Of course,
the explanation that microteaching reduces
anxiety at the beginning of student teaching,
allowing the student to profit more from the
teaching experience, is also consistent with
these findings. The group differences on dif-
ferent subscales indicate that the hypothe-
sized problem-solving skill learned during
microteaching is likely to be task or lesson
oriented as opposed to pupil centered. The
microteaching subjects initially showed less
interest in pupils, but this was reversed in
the ratings during the third period. The only
scale where the microteaching group did
not exceed the control group was in interest
in student achievement.

Fig. 5. Mean ratings obtained for teacher in-
terest in pupils. (The asterisk indicates signifi-
cant difference between groups (p < .05) using
the Tukey [a] test.)

Fig. 6. Mean ratings obtained for teacher in-
terest in student achievement.

While noting that several additional 
variables, such as feedback or subjects’
personality, were not included in this ex-
periment, it is felt that these data provide
the clearest, most specific description of the
effect of televised microteaching on subse- 
quent classroom teaching. 
OBSERVATIONAL LEARNING:


EFFECTS OF OBSERVED REWARD AND 
RESPONSE PATTERNS¹


YOSSEF GESHURI²
Institute of Child Behavior and Development, University of Iowa 


The function of observed reward and the influence of the model's pat- 
tern of responding were examined in an observational learning situa- 
tion. Experiment I served to identify a class of reinforcing verbal 
stimuli and a class of nonreinforcing neutral stimuli. The words 
“good,” “great,” and “goody” were found to be reinforcing stimuli. 
On each of five trial blocks in Experiment II, 72 kindergarten and 
first-grade boys first observed a video film of a model and then were 
given the opportunity to play with the same materials viewed in the 
film. The model exhibited either an increasing or a constant pattern 
of critical responses over trial blocks, and the critical responses were 
followed by either verbal reward stimuli, verbal nonreward stimuli, 
or no stimuli. It was found that (a) only the verbal reward conse- 
quences produced learning in the observers and (b) observers in the 
increasing and constant pattern conditions exhibited increasing and 
constant patterns of responding, respectively. The results also sug-
gest that observed reward serves as a cue for matching.


It is generally accepted that much of the 
learning in a conventional classroom setting 
occurs through direct tuition and through 
differential reinforcement of desired be- 
haviors of the learner. However, since the 
correct or desired responses of the child are, 
in effect, neither immediately reinforced nor 
are they reinforced often enough, the child 
learns a great deal through observation of 
the experiences of others. These experiences 
include both the behaviors of others and 
the consequences to these behaviors. Some 
of the research that has been concerned 
with the process of learning through obser- 
vation has been focused on the effect of 
social consequences (i.e., praise or reproof)
to the behavior of the performer (model)
on the subsequent imitative performance of
the observer. Social reward, or praise, to the
model has generally been shown to facilitate
observational learning, but the details of
this process are still unclear. This research
investigated this phenomenon in greater
detail.

¹ This paper is based on a thesis submitted by
the author in partial fulfillment of the require-
ments for the Master's degree at the University
of Iowa. The author acknowledges the invaluable
assistance of Richard A. Dubanoski and David A.
[illegible] of the staff of the
Longfellow school of the Iowa City Community
School District.

² Requests for reprints should be sent to Yossef
Geshuri, Psychology Department, Northwest
Missouri State College, Maryville, Missouri 64468.

Recent experiments dealing with the effect 
of observed reward, or vicarious reinforce-
ment, on observers’ imitative responses (e.g.,
Kanfer & Marston, 1963; Marston, 1966;
Rosenbaum & Arenson, 1968) have used a
procedure in which the observer responded
alternately with the model and demonstrated
that reward to the observed model was ef-
fective in producing learning in observers.
The experiments showed that observers in a
condition where some of the model’s re-
sponses (critical responses) were followed by
the word “good” exhibited a greater increase
in the number of critical responses than
observers in a no-consequence condition. On
the basis of these findings it was concluded
that observed reward served a reinforcing
function. Such a conclusion, however, is not
warranted. First, since the word “good” ap-
peared in each trial block before, rather than
after, the observer's response it could have
served only a cue function, a function which
any verbal consequence to the model’s 
critical responses might have provided. It is 
possible, however, that observed reward 
serves a cue function which other verbal 
consequences do not provide. Second, to 
assess whether observed reward serves a 
unique function would require contrasting
observed reward with a control condition in 
which the critical responses of the model are 
followed by nonreward, neutral verbal 
consequences. In such a control condition, 
the reward property of the verbal conse- 
quences to the model would be eliminated 
while maintaining a potential cue function. 

Third, the studies cited above used only 
one word, “good,” as a consequence to the 
critical responses of the model. Thus, the 
difference obtained between the consequence 
“good” condition and the no-consequence 
condition may be limited to that specific 
word rather than to the stimulus class of
observed verbal rewards. A convincing
demonstration that the performance of the
observers is due to observed verbal rewards
would require, among other factors, that the
words delivered to the model consist of
several instances of the class of reward ver- 
balizations rather than only one word. 

The findings of the studies cited above also 
suggest that the model’s pattern of respond- 
ing influences the observers’ performance. 
One study (Rosenbaum & Arenson, 1968), 
in which the model displayed a constant 
number of critical responses over trial blocks, 
has shown that observers’ performance of 
critical responses remained constant over 
trial blocks. In the other studies, where the 
model displayed an increase in critical re-
sponses over trial blocks, the observers per-
formed an increase in critical responses over
trial blocks (Kanfer & Marston, 1963;
Marston, 1966; Marston & Kanfer, 1963;
and Phillips, 1968a, 1968b). Since each of
those studies used only one pattern, the
effect of different response patterns of the
model on observers’ performance has not
been directly compared. Thus, two response
patterns were included in this study, an in-
creasing and a constant number of critical
responses over trial blocks. 

The primary purpose of this study was to 
examine the effect of observed verbal reward 
(i.e., a class of identifiable reward stimuli,
rather than one word) to a model on an 
observer’s performance while controlling for 
the presentation of verbal consequences to 
the critical responses of the model. The 
secondary purpose was to examine the effect 
of the model’s pattern of responding on an 
observer’s performance. Experiment I served 
to identify a class of reinforcing verbal 
stimuli and a class of nonreinforcing, neu-
tral stimuli. Experiment II examined the
effects of these two classes of consequences 
to the model, and the model’s patterns of 
responding on observers’ performance. 


EXPERIMENT I


In order to identify a class of reinforcing 
stimuli and a class of nonreinforcing stimuli, 
five words, monosyllabic or bisyllabic adjec-
tives or adverbs that commonly denote
praise or approval, were selected as potential 
reinforcers. The words were “good,” “great,” 
“nice,” “fine,” and “goody.” Five additional 
words, also mono- or bisyllabic adjectives or 
adverbs, were selected as potential non- 
reinforcers on the basis of having low fre- 
quency of associations with approval or 
disapproval (Palermo & Jenkins, 1964). 
These words were “even,” “when,” “some,” 
“ever,” and “about.” 

The reinforcing effect of each of these 
words was assessed in a two-choice dis- 
crimination task; each word was presented 
to the subjects upon the emission of correct 
responses. To fulfill the criterion for rein- 
forcement, it was necessary to show that 
subjects would increase the number of 
correct responses as a function of receiving a
particular word. For nonreinforcement, the
word should have no effect on the subjects’ 
performance. 


Method 


Subjects. Sixty preschool children, 30 boys and
30 girls, were randomly assigned to 10 groups, 3 
boys and 3 girls per group. 

Procedure. The stimuli were round blocks, each 
4 inches in diameter and painted either white, 
black, blue, or yellow. They were presented, 
two at a time, black and white (S+) or blue (S+)
and yellow, on a 20 by 22 inch turntable. An 18 
by 22 inch screen in the middle of the turntable 
prevented the subjects from observing the prep- 
aration of the stimuli for the subsequent trial. 
The subjects, tested individually, were in- 
structed to select one block of a pair of blocks on 
each of 30 trials. One half of the subjects was 
tested on the black and white blocks and the 
remainder on the blue and yellow blocks. The 
position of the stimuli on each trial was randomly 
determined with the one restriction that the 
correct stimulus appeared five times on each side 
in each block of 10 trials, and this sequence was 
used for all subjects. The experimenter delivered 
a verbal stimulus following each selection of a 
correct block in a typical noncorrection discrimi- 
nation learning procedure. Ten words, 5 potential 
reinforcers and 5 potential nonreinforcers, were 
employed, and each word was used with six sub- 
jects, for a total of 10 experimental groups. 
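The position restriction described above can be illustrated with a minimal Python sketch. It is not the experimenter's actual procedure: the function name `balanced_position_sequence`, the labels "L" and "R", and the fixed seed are assumptions made only for illustration. The sketch shuffles five left and five right placements within each block of 10 trials and reuses the one resulting sequence for every subject.

```python
import random


def balanced_position_sequence(n_blocks=3, block_size=10, seed=0):
    """Return 'L'/'R' positions of the correct stimulus for all trials.

    Within each block the correct stimulus appears equally often on each
    side (five times per side when block_size is 10).
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even to balance the two sides")
    # A fixed seed (an assumption) yields one sequence that can be reused
    # for all subjects, as the text requires.
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = ["L"] * (block_size // 2) + ["R"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence


sequence = balanced_position_sequence()
print("".join(sequence))
```

Calling the function again with the same seed returns the same 30-trial sequence, which is one simple way to hold the order constant across subjects.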


Results and Discussion 


Table 1 shows the mean number of correct
responses obtained on three blocks of 10
training trials under each of the 10 word
conditions. An overall analysis of variance,
performed on the number of correct re-
sponses, revealed a significant interaction
between words and trial blocks (F = 3.12,
df = 18/100, p < .001). This interaction
indicated that the verbal consequences to
the correct responses of the subjects had
differential effects over trial blocks. Follow-
up analyses revealed that the number of
correct responses increased significantly
over trial blocks only for the groups “good”
(F = 7.95, df = 2/10, p < .025), “great”
(F = 6.41, df = 2/10, p < .05), and “goody”
(F = 13.07, df = 2/10, p < .005). Thus,
only these words fulfilled the reinforcement
criterion. Subjects in the “even,” “when,”
and “ever” conditions showed the least
deviation from chance performance; hence,
these words best fulfilled the nonreinforce-
ment criterion.

TABLE 1
MEAN NUMBER OF CORRECT RESPONSES
[Rows: condition; columns: trial blocks]


EXPERIMENT II


This experiment examined the effect of 
observed reward to a model and the model's
pattern of responding on the subsequent
performance of the observers. To study the
effects of observed reward, three conse-
quence conditions were introduced: reward
consequence, nonreward consequence, and 
no consequence. For reward and nonreward 
verbal consequences, the words used were 
those identified in Experiment I as rein- 
forcers and nonreinforcers. If the cue func- 
tion of observed reward stems from the fact 
that the consequence to the model is a rein- 
forcing event, then the reward condition 
should facilitate the observers’ performance 
of critical responses more than either the 
neutral statement condition or the no- 
consequence condition, and the latter two 
conditions should not differ from each other. 
However, if the cue function of observed 
reward is unrelated to the fact that the 
consequence to the model is a reinforcing 
event, then the reward consequence con- 
dition and the nonreward consequence 
condition should facilitate observers’ per-
formance more than the no-consequence
condition and the former two conditions 
should not differ from each other. 

To study the effect of different patterns of 
the model’s responding on an observer’s per-
formance, the model exhibited either an
increasing pattern of critical responses or a
constant pattern. The total number of
critical responses performed by the model
was the same in both conditions; hence, a
difference in performance occurring between
the observers exposed to these conditions
would be attributed to the different patterns
of responding rather than to different num-
bers of observed critical responses. If the
increase in critical responses over trial
blocks exhibited by the observers in previous
research was due to the model's increasing
pattern of critical responses, then such
performance should occur in the increasing
pattern condition but not in the constant
pattern.


Method 


Observers and model. The observers, 72 kinder- 
garten and first-grade boys, were randomly as- 
signed to six groups of equal size. The model was 
a 10-year-old boy. 

Apparatus. The experiment was conducted in a 
laboratory trailer consisting of an experimental 
room and an observation room. The experimental 
room contained a table, a chair, and a television 
monitor, and the observation room contained a 
video tape recorder. An 18-inch high vertical 
panel was mounted at the rear of the table. Cen- 
tered on the panel was a 9 by 9 inch inclined 
Plexiglas board with a 5 by 5 matrix of hooks. 
On the table were five different kinds of stimuli, 
25 of each kind: rings, 1.5 inches in diameter;
wood tubes, 1 inch long and .5 inch in diameter;
.75-inch cubes, each with a hole in the middle;
large paper clips; and 1.5 by 1.5 by 1.5 inch wooden 
shapes called horseshoes. A removable screen 
covered the table and its contents. The television 
monitor was 5 feet away from one side of the table 
so that the observer, while sitting at the table,
could alternately watch television and manipu- 
late the stimuli on the table by turning his head. 

Procedure. The observer was seated in front of 
the covered experimental table but facing the 
television set. He was told to watch television 
when it was on, and when the television was off 
he was to play³ with the toys on the table after
the screen covering the table was raised. In addi-
tion, the observer was told that he would alternate
between watching television and playing, and that
a red light above the hookboard would signal him
when to stop playing and attend to the television.
The experimenter operated the apparatus from
behind the table and could not be seen by the
observer.

³ Rosenbaum and Arenson (1968) offered an
alternative interpretation of the previously cited
research on observed reward. They suggest that
observers provide word associations to the critical
responses of the model (e.g., if the model emits
human words, then observers would tend to emit
words of the same class), and thereby, an increase
in the performance of critical responses by the
model would evoke an increasing number of word
associations within the class of critical responses
for observers. To eliminate the possibility of the
formation of word associations to the critical re-
sponses of the model, a motor task was employed
in the present study.

Each observer watched one of six films, each of
which constituted an experimental condition.
Each condition consisted of a kind of pattern of
critical responses exhibited by the model and a
type of response consequence to the model. In
three of the films, the model performed an increas- 
ing number of critical responses (2, 3, 4, 5, and 6) 
over five blocks of 10 trials; while in the remaining 
trials in each block, the model performed the non- 
critical responses. In the other three films the 
model performed a constant number of critical 
responses which consisted of four critical responses 
and six noncritical responses in each of the five 
trial blocks. The critical response was defined as 
placing a horseshoe on the hooks, and the non- 
critical response as hanging the rings, paper clips, 
blocks, or tubes on the hooks. In the reward condi- 
tion, each critical response was followed by a 
reward stimulus (the words “good,” “great,” and
“goody” were used in random order); in the non-
reward condition the critical response was fol-
lowed by a nonreward stimulus (the words “even,”
“when” and “ever”); and in the no-consequence 
condition no words followed the critical responses. 
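The two response patterns can be checked with a few lines of arithmetic. The following is a minimal Python sketch using only the counts stated in the text; the variable and function names are assumptions introduced for illustration. It confirms that both patterns contain the same total number of critical responses, so the films differ in how those responses are distributed over blocks rather than in how many are shown.

```python
# Critical responses (horseshoes) per block of 10 model responses, as stated
# in the text. All names here are illustrative assumptions.
TRIALS_PER_BLOCK = 10
INCREASING = [2, 3, 4, 5, 6]
CONSTANT = [4, 4, 4, 4, 4]


def noncritical(schedule):
    """Noncritical responses per block: the remaining trials in each block."""
    return [TRIALS_PER_BLOCK - critical for critical in schedule]


total_increasing = sum(INCREASING)
total_constant = sum(CONSTANT)

print("increasing pattern:", INCREASING, "total", total_increasing)
print("constant pattern:  ", CONSTANT, "total", total_constant)
print("noncritical (increasing):", noncritical(INCREASING))
print("noncritical (constant):  ", noncritical(CONSTANT))
```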

Each film was 4.5 minutes long and was accom- 
panied by music. The opening scene was a 15- 
second scan of the experimental setting in which 
the model was sitting at the table facing the tel- 
evision camera and an adult male was sitting 3 
feet to the side of the model. After the scan, the 
camera focused only on the model and the table, 
and the model began to place the different stimuli 
on the hooks. 

Following each block of 10 responses performed 
by the model, the film stopped and the screen cov- 
ering the table was raised. The experimenter re- 
corded the type of stimuli that were hung on the 
hookboard. Following the tenth response, the red 
light was turned on, the screen lowered, and the 
next video sequence began. During the television 
sequence the experimenter returned all the stimuli 
to their respective piles on the table. 

The design of Experiment II was a 3 X 2 X 5
factorial. The three levels of the first factor
were the consequence treatments: reward, non-
reward, or no consequence to the model. The sec-
ond factor was a response pattern variable with
either an increasing or a constant pattern of re-
sponding. The third factor, trial blocks, consisted
of five blocks of 10 trials each.
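
The degrees of freedom implied by this mixed design can be sketched as follows. The sketch treats consequence (B) and pattern (C) as between-subjects factors and trial blocks (A) as the within-subjects factor. It assumes 12 observers per group, a value that is not stated in this section and is inferred here from the 71 between-subjects degrees of freedom reported in Table 2. The function name is illustrative.

```python
# Degrees of freedom for a 3 X 2 X 5 mixed factorial design.
# ASSUMPTION: 12 observers per group (inferred from the df in Table 2,
# not stated in the text).

def mixed_design_df(levels_b, levels_c, levels_a, n_per_group):
    """Return df for two between-subjects factors (B, C) and one within factor (A)."""
    n_subjects = levels_b * levels_c * n_per_group
    df = {
        "between_subjects": n_subjects - 1,
        "B": levels_b - 1,
        "C": levels_c - 1,
        "BxC": (levels_b - 1) * (levels_c - 1),
        "error_b": levels_b * levels_c * (n_per_group - 1),
        "within_subjects": n_subjects * (levels_a - 1),
        "A": levels_a - 1,
        "AxB": (levels_a - 1) * (levels_b - 1),
        "AxC": (levels_a - 1) * (levels_c - 1),
        "AxBxC": (levels_a - 1) * (levels_b - 1) * (levels_c - 1),
    }
    # Whatever within-subjects df is not used by A and its interactions is error.
    df["error_w"] = (df["within_subjects"] - df["A"] - df["AxB"]
                     - df["AxC"] - df["AxBxC"])
    df["total"] = n_subjects * levels_a - 1
    return df


design_df = mixed_design_df(levels_b=3, levels_c=2, levels_a=5, n_per_group=12)
for source, value in design_df.items():
    print(f"{source:>16}: {value}")
```

Under the stated assumption, the computed values agree with the df column of Table 2.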


Results 

Table 2 shows the summary of the analy- 
sis of variance on the frequency of critical 
responses. The significant Consequence X 
Pattern interaction (see Figure 1, which
represents the mean number of critical
responses over the five trial blocks for the
six experimental groups) suggests a differ-
ential effect of the three consequence con-
ditions under the increasing pattern but not
under the constant pattern. Further analyses 
were performed on the two pattern con-
ditions separately. No significant effects
were found for the analysis involving the
constant pattern condition.

TABLE 2
SUMMARY OF THE ANALYSIS OF VARIANCE

Source                  df      MS        F
Between subjects        71
  Consequences (B)       2    36.74     8.69***
  Patterns (C)           1    20.54     4.86*
  B X C                  2    21.81     5.16**
  Error (b)             66     4.23
Within subjects        288
  Trial blocks (A)       4    21.93    18.61***
  A X B                  8     3.98     3.38***
  A X C                  4    15.74    13.36***
  A X B X C              8     2.20     1.87
  Error (w)            264     1.18
Total                  359

* p < .05.
** p < .01.
*** p < .001.
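
As a rough check on Table 2, each F ratio is the effect mean square divided by the appropriate error mean square: Error (b) for the between-subjects effects and Error (w) for the effects involving trial blocks. The sketch below recomputes the ratios from the printed mean squares. Discrepancies of a few hundredths are expected, since the printed mean squares are themselves rounded. The dictionary layout and function name are illustrative only.

```python
# Recompute the F ratios of Table 2 from the printed mean squares.
MS_ERROR_B = 4.23  # Error (b), between subjects
MS_ERROR_W = 1.18  # Error (w), within subjects

# effect name: (effect mean square, error mean square, F as printed)
effects = {
    "Consequences (B)": (36.74, MS_ERROR_B, 8.69),
    "Patterns (C)": (20.54, MS_ERROR_B, 4.86),
    "B X C": (21.81, MS_ERROR_B, 5.16),
    "Trial blocks (A)": (21.93, MS_ERROR_W, 18.61),
    "A X B": (3.98, MS_ERROR_W, 3.38),
    "A X C": (15.74, MS_ERROR_W, 13.36),
    "A X B X C": (2.20, MS_ERROR_W, 1.87),
}


def f_ratio(ms_effect, ms_error):
    """F is the ratio of the effect mean square to its error mean square."""
    return ms_effect / ms_error


recomputed = {name: f_ratio(ms, err) for name, (ms, err, _) in effects.items()}
for name, (_, _, printed) in effects.items():
    print(f"{name:<18} F = {recomputed[name]:6.2f} (printed {printed})")
```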

For the analysis involving the increasing 
pattern condition a significant main effect of 
consequences (F = 17.64, df = 2/33, p <
.001) indicated that the observers’ per-
formance differed under the three conse-
quence conditions. Examination of this
difference with a Scheffé test revealed that 
the increasing reward condition performed a 
significantly greater number of critical 
responses than either the increasing non- 
reward or increasing no-consequence con- 
dition (p < .05). No difference was found 
between the latter two conditions. 

The analysis for the increasing pattern 
also showed that the observers’ performance 


reward versus increasing no-consequence 
comparison (F = 9.69, df = 4/88, p < .001).
These findings indicate that the observed 
reward stimuli in the increasing pattern 
facilitated the observers’ performance over 
trial blocks more than either the nonreward
or no-consequence conditions to the model.

Past research provided a basis for ex- 
pecting that the observers in an increasing 
pattern would exhibit an increase in the 
number of critical responses over trial blocks, 
and observers in a constant pattern would
not show an increase. A significant Trial
Blocks X Pattern interaction (F = 13.36,
df = 4/264, p < .001), which was obtained
for the overall analysis of variance, indicated 
that the observed patterns of responding 
had differential effects on the observer's 
performance. 


DISCUSSION


This research both confirmed and qualified
the results of past research. The findings 
show that observers in the increasing reward 
condition performed a greater number of 
critical responses than observers in the in- 
creasing nonreward or no-consequence con- 
dition, while no difference occurred between 
the latter two conditions. These findings 
indicated that the facilitating effect of ob- 
served consequences to the model on the 
observers’ performance may be attributed 
to a specific class of verbal stimuli, namely 
reward words, rather than any kind of 
verbal statement. Thus, it appears that the 
cue function of observed reward stems from 
the fact that the observed consequences to 
the model are reinforcing events. The fa- 
cilitating effect of observed reward on the 
observers’ performance occurred only when 
the model exhibited an increasing pattern of 
critical responses. This suggests that the 
effect of observed reward depends on the 
pattern of responding exhibited by the 
model. 

This research shows that the response
patterns of the model influence the sub-
sequent performance of observers. Experi-
ment II showed that observers performed
an increasing number of critical responses
over trial blocks when exposed to them in an
increasing pattern and a constant number of
critical responses when exposed to a constant
pattern. The findings of this study, which
provided a direct comparison of two patterns
of responding, corroborate the findings of
past research in which the model's pattern of

[Figure 1. Line graph in two panels (INCREASE, CONSTANT) with separate curves for the REWARD, NONREWARD, and NO CONSEQUENCE groups; vertical axis: MEAN NUMBER OF CRITICAL RESPONSES; horizontal axis: TRIAL BLOCKS.]

FIG. 1. Mean number of critical responses for the six experimental groups as a function of trial
blocks.


responding appeared to influence the ob-
servers’ performance.

The findings showing that the model’s
pattern of responding influences the per-
formance of the observers suggest that
observers match the pattern of the critical
responses of the model. Such an interpreta-
tion was offered by Phillips (1968b). Evalua-
tion of this interpretation would require
assessing how accurately each observer
matched the model’s performance rather
than examining the mean performance of
a group of observers. Such an analysis, which
has not been reported in past research, was
performed in this study by comparing the
number of critical responses performed by 
an observer in each trial block with the num- 
ber of critical responses exhibited by the 
model in the corresponding block. It was 
found that seven observers in the increasing 
reward condition closely matched the ob- 
served incidence of the critical responses (i.e., 
the number exhibited by the observer dif- 
fered from that of the model by no more 
than one critical response in any trial block). 
In the increasing nonreward and no-conse- 
quence conditions a total of two observers 
closely matched the observed incidence of 
critical responses, which was significantly 
less than the number of observers who closely 
matched in the increasing reward condition
(χ² = 8.48, p < .01). In the three con-
stant conditions, only two observers closely
matched the model's performance. 
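
The close-matching criterion described above can be sketched in a few lines. The model's counts per trial block (2, 3, 4, 5, 6 for the increasing pattern and four per block for the constant pattern) come from the method section. The two observers below are hypothetical and serve only to illustrate the rule; the function name and the `tolerance` parameter are assumptions of this sketch.

```python
# Critical responses shown by the model in each of the five trial blocks.
INCREASING_MODEL = [2, 3, 4, 5, 6]
CONSTANT_MODEL = [4, 4, 4, 4, 4]


def closely_matches(observer_counts, model_counts, tolerance=1):
    """True if the observer differs from the model by no more than
    `tolerance` critical responses in every trial block."""
    if len(observer_counts) != len(model_counts):
        raise ValueError("observer and model must have the same number of blocks")
    return all(abs(o - m) <= tolerance
               for o, m in zip(observer_counts, model_counts))


# Hypothetical observers (illustrative data, not taken from the study).
observer_a = [2, 4, 4, 6, 5]
observer_b = [3, 3, 3, 3, 3]

print(closely_matches(observer_a, INCREASING_MODEL))
print(closely_matches(observer_b, INCREASING_MODEL))
print(closely_matches(observer_b, CONSTANT_MODEL))
```

Because the rule is applied block by block, a single block that is off by two or more disqualifies an observer even when the observer's total number of critical responses equals the model's.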

The above results suggest that the close 
matching that occurred in the reward con- 
dition may have been due to the observed 
reward serving as a cue for matching. This 
interpretation, however, fails to account for 
the findings that close matching did not 
occur in the constant reward condition. 
Thus, these data allow the general conclusion 
that the influence of observed reward de- 
pends on the pattern of the model's re- 
sponding. Further investigation of the effect 
of observed reward on matching should 
employ other patterns of the model’s re- 
sponding to determine how these patterns 
interact with observed reward in deter-
mining matching.

This research demonstrates, in effect, the
occurrence of discrimination learning
through observation, a finding that appears
to have wide applicability to classroom 
learning. The research also suggests that in 
classroom learning, much of which occurs 
through observation of the performance of 
others, the systematic occurrence of positive 
consequences to certain responses of some
pupils may evoke discriminative observa- 
tional learning in other pupils. 
With a sample of 64 secondary school classes in Montreal, the pre-
dictive validity of a set of 15 learning environment scales was tested
in eight subject areas: physics, chemistry, biology, geography, math- 
ematics, English literature, history, and French. Regression analyses 
showed the incremental validity of the scales administered at mid- 
year to account for end-of-year achievement variance in class means 
beyond that accounted for by IQ. The results were cross-validated 
on random split samples. The environment scale correlations with 
achievement are constant across class mean IQ levels and nearly con-
stant across subject areas.


Recent studies (Walberg, 1969; Walberg 
& Anderson, 1968) of two independent na- 
tional samples of high school physics classes 
have shown that measures of student per- 
ceptions of the learning environment predict 
cognitive and affective outcomes even when 
measured intelligence, prior achievement, 
and interest in the subject are held constant. 
The main purpose of the present research 
was to determine the power of the environ- 
ment scales to predict achievement in other 
areas of the high school curriculum. 

As argued elsewhere, it has been difficult 
to find systematic variations in properties of 
classes, such as grouping and size (Stephans, 
1968) and teacher behavior (Rosenshine, 
1970; Walberg, 1969), that are consistently 
related to achievement outcomes. Moreover, 
teacher observation systems require trained
observers to visit classes several times.
Environment scales, on the other hand, tap
the student’s perceptions of a wide range of 
instructional and social cues relevant to his 
own learning; are fairly convenient to ad-
minister; and have incremental predictive
validity in physics classes. Thus, if it can be 
shown that they are valid for predicting 
achievement in other subjects, it might be 
concluded that they are promising as effici- 
ent and useful tools for general research on 
instruction and learning. 


METHOD


Sample 

Data on the environment and IQ were obtained 
in midwinter from eight English-speaking sec- 
ondary schools in the Montreal metropolitan area. 
Classes were sampled randomly within schools 
and represented the subject-matter areas of phys- 
ics, biology, chemistry, geography, mathematics, 
English literature, history, and French (See 
Table 4 for the numbers in each subject). The 
students in the 64 classes ranged from 15 to 17 years 
of age and were either in their tenth or eleventh 
year of school. 


Instrument 


The Learning Environment Inventory scales 
concern the relationships of the pupils to one an- 
other, to the organizational properties of the class, 
to class activities, and to the physical environ- 
ment. Each of the scales consists of seven state- 
ments descriptive of high school classes. The 
respondent expresses the extent of his agreement 
or disagreement with each item on a 4-point scale. 
For each of the 15 scales, the mean response on 
the seven items is calculated and the mean of all 
student ratings in the class provides an estimate 
of the collective student perception of the class- 
room environment. The alpha reliabilities for the 
scales for classes range between .5 and .8 (see
Walberg, 1969, for exact figures, sample items, and
the conceptual framework of the instrument). 

The High School Leaving Examinations of the
Province of Quebec were selected as the achieve-
ment criteria. These were administered to all
students at the end of the spring term and serve
as a primary criterion for employment and admis-
sion to universities. The tests are constructed and
scored under supervision of the Quebec Depart-
ment of Education. Scores are standardized in
each subject area to a mean of 65 and standard 
deviation of 15. The tests consist of about half 
multiple-choice and half essay questions. Al- 
though internal consistency reliabilities were not 
available for the present samples, estimates of 
recent test forms, that do not change much from 
year to year, ranged from .7 to .8. 


Procedure 


The Learning Environment Inventory and IQ
scores were obtained during a 1-month period
in December. While approximately 75% of the
pupils in each class completed the Inventory, the
remaining 25% took the Henmon-Nelson Test of 
Mental Abilities, in a system of randomized data 
collection within classes used previously (Walberg 
& Anderson, 1968). Since the mean class size is 25, 
the environment is estimated using about 20 pupils 
per class, while the class mean IQ is an estimate 
based upon approximately five individual scores. 

The mean achievement score for students who did
not take the Learning Environment Inventory is
based on about six or seven individual scores per
class, since it includes those who took the IQ test
instead of the Inventory and those who were ab-
sent on the day of the December test ad-
ministration.

Class means were computed on IQ, the 15 Learn-
ing Environment Inventory scales, and the 
achievement tests for students not administered 
the Inventory, and also for the total class. Since 
the standardization of the tests may not have been 
perfect; that is, means in different areas may have 
differed, seven “dummy” (0 or 1) variables (see
Cohen, 1968) were generated to identify each class 
uniquely with one of the eight subjects. The re- 
gression of the achievement criteria on the seven 
dummy variables produces F ratios equivalent to 
a one-way analysis of variance (with eight levels) 
for the between-subject variance in achievement 
(see Walberg, 1971).
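
The dummy coding described above can be sketched as follows. With eight subjects, seven 0/1 indicators identify each class, and the eighth subject is the all-zero reference category. Because a regression on such indicators reproduces the group means as its fitted values, its overall F ratio equals the one-way analysis of variance F, which the second function computes directly. The choice of French as the reference category, the function names, and the toy scores are assumptions made for illustration; none of the numbers are data from the study.

```python
SUBJECTS = ["physics", "chemistry", "biology", "geography",
            "mathematics", "English literature", "history", "French"]


def dummy_code(subject, categories=SUBJECTS):
    """Return k - 1 indicators; the last category is the all-zero reference."""
    if subject not in categories:
        raise ValueError(f"unknown subject: {subject}")
    return [1 if subject == c else 0 for c in categories[:-1]]


def one_way_f(groups):
    """F ratio of a one-way ANOVA; `groups` maps a label to a list of scores."""
    scores = [x for g in groups.values() for x in g]
    grand_mean = sum(scores) / len(scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((x - sum(g) / len(g)) ** 2
                    for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


print(dummy_code("physics"))
print(dummy_code("French"))

# Toy class means for three hypothetical subject areas (not study data).
toy_groups = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [6, 7, 8]}
toy_f = one_way_f(toy_groups)
print(toy_f)
```

With 64 classes and eight subjects, the same computation has 7 and 56 degrees of freedom, which matches the first row of Table 1.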

A set of products between the 15 Learning En-
vironment Inventory scales on one hand and IQ
and the seven dummy subjects variables on the 
other was also generated. Tests of th 


HERBERT J. WALBERG AND GARY J. ANDERSON 


TABLE 1
INCREMENTS IN ACHIEVEMENT VARIANCE (R²)
ACCOUNTED FOR BY COMPLETE
REGRESSION MODEL


                          Achievement varianceᵃ
Added to                Students in each    All students
model           df      class not           in each class
                        taking LEI
Subjects       7/56       12.14               13.10
IQ             1/55        6.93*              12.43*
LEI           15/40       42.53**             46.23**
LEI x IQ      15/25       18.38               10.68
Total         38/25       79.99**             82.35**
R             38/25       .8944**             .9074**


Note.—Abbreviation: LEI = Learning Environment Inventory.
ᵃ In percentages.
* p < .05.
** p < .01.


RESULTS AND DISCUSSION 


The results of the complete regression 
model predicting the mean achievement of 
students not taking the Learning Environ- 
ment Inventory and the mean achievement 
of the total class are shown in Table 1. The 
first step, entry of the subject dummy vari- 
ables, is not significant; this suggests that 
the tests are fairly well standardized and 


TABLE 2
INCREMENTS IN ACHIEVEMENT VARIANCE (R²)
ACCOUNTED FOR BY REDUCED
REGRESSION MODEL


                          Achievement varianceᵃ
Added to
model           df      Non-LEI M       Total M
IQ             1/62       9.33**        14.58**
                  Step-down F ratio = 3.77*
LEI           15/47      47.59**        50.76**
                  Step-down F ratio = 3.09**
Total         16/47      56.92**        65.35**
                  Step-down F ratio = 3.25**
R             16/47      .7545**        .8084**


Note.—Abbreviation: LEI = Learning Environment Inventory.
ᵃ In percentages.
* p < .05.
** p < .01.


PROPERTIES OF THE ACHIEVING URBAN CLASSES 


TABLE 3
INCREMENTS IN TOTAL MEAN ACHIEVEMENT VARIANCE (R²) ACCOUNTED
FOR IN SPLIT SAMPLES


                                  Achievement varianceᵃ
                        Original                      Cross-validation
Added to model    df     Sample A   Sample B    df     Sample A   Sample B
Model I
  LEI alone      15/16   75.67**    60.24       1/30   41.65**    17.41*
  R              15/16   .8699**    .7762       1/30   .6454**    .4173*
Model II
  IQ              1/30   17.87*      5.81       1/30   17.87*      5.81
  LEI            15/15   57.80*     55.35       1/29   26.44**    12.72*
  Total          16/15   75.67*     61.16       2/29   44.21**    18.53*
  R              16/15   .8699*     .7821       2/29   .6656**    .4304*


Note.—Abbreviation: LEI = Learning Environment Inventory.
ᵃ In percentages.
* p < .05.
** p < .01.


differences in mean achievement among the 
eight subject areas are not appreciable. 
However, entries of IQ and the set of 15 
Inventory scales each produced significant 
increments in explained achievement vari- 
ance on both criteria. In the last step, entry 
of the 15 products of IQ and Learning En- 
vironment Inventory scales was not signifi- 
cant; this indicates that the regression slopes 
for achievement on the Inventory scales can 
be regarded as homogeneous across different 
levels of IQ. Since the subject and product 
terms were not significant, they were de- 
leted from subsequent analyses, and a more 
parsimonious regression model was used. 
Table 2 shows that the reduced 16-term 
model accounts for a significant and sizable 
amount of the achievement variance on both 
criteria and that the Learning Environment
Inventory scales account for a major frac-
tion of this variance. It may be noted that
the prediction of the total class mean is more
accurate than the mean of students who did
not take the Learning Environment Inven-
tory; the step-down F ratios (Bock, 1963)
showed that both IQ and the set of Inven-
tory scales contributed significantly to the
prediction of the total mean even after par-
tialing the covariance of the non-Learning
Environment Inventory mean from this cri-
terion. Since both IQ and the Inventory
were less predictive of the non-Inventory 
mean, it might be speculated that the rela- 
tive inaccuracy is attributable to the lower 
reliability of this criterion (stemming from 
the smaller sample of students used to esti- 
mate this class mean) rather than response 
bias common to the Learning Environment 
Inventory and achievement. 
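
The stepwise logic of the complete model in Table 1 can be illustrated with the usual F test for an increment in explained variance. The sketch below assumes that each increment was tested against the residual variance of the model at that step, which is what the degrees of freedom printed in Table 1 suggest; the original computations are not given, so the resulting F values are approximate reconstructions and not reported results. It uses the increments for the criterion based on students who did not take the Inventory. The function name is illustrative.

```python
def increment_f(delta_r2, cumulative_r2, df_num, df_den):
    """F for an R-squared increment, tested against the residual variance
    of the model that includes the newly added terms."""
    return (delta_r2 / df_num) / ((1.0 - cumulative_r2) / df_den)


# (term, increment in percent, numerator df, denominator df), from Table 1
steps = [
    ("Subjects", 12.14, 7, 56),
    ("IQ", 6.93, 1, 55),
    ("LEI", 42.53, 15, 40),
    ("LEI x IQ", 18.38, 15, 25),
]

cumulative = 0.0
f_values = {}
for term, percent, df_num, df_den in steps:
    delta = percent / 100.0
    cumulative += delta  # variance explained once this step has been entered
    f_values[term] = increment_f(delta, cumulative, df_num, df_den)
    print(f"{term:<9} increment = {percent:5.2f}%  F({df_num}, {df_den}) = {f_values[term]:.2f}")
```

Under this assumption the ratios for the subject dummies and for the products stay close to 1, while those for IQ and the Inventory scales are several times larger, which is consistent with the pattern of significance markers in Table 1.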

Table 3 shows the double cross-validation 
results on two random halves of the whole 
sample. The first regression model, employ- 
ing only the Inventory scales, was significant 
for both split-sample cross-validations; 
Sample A weights cross-validated on Sample 
B data, and vice versa. Moreover, on the 
second model, there is evidence for a more 
stringent test of validity, what might be 
termed “cross validated incremental valid- 
ity”; the Sample A weighted Inventory 
predictor variate accounts for significant 
variance beyond IQ on the Sample B data, 
and vice versa. 

In the last series of analyses, the following 
terms were added to the regression model: 
the seven dummy subject variables, IQ, one 
Inventory scale, and the seven products of 
the Inventory scale with the subject vari- 
ables. The last column in Table 4 shows 
that the products contribute little to the 
accountable variance; thus, the regression 
slopes of total mean achievement on the 15 
Learning Environment Inventory scales are
not significantly heterogeneous across sub-
ject areas with two minor exceptions: En-
vironment and Friction. Examination of the
correlations in the separate subjects reveals 
that books, materials, and working space in 
the learning environment and the absence of 
friction among class members appear to be 
more important in mathematics, physics, 
and history, than in other subjects. Aside 
from these exceptions, the three columns of 
overall correlations between the Inventory 
scales and mean achievement (total class, 
those in each class not taking the Inventory, 
and total class with IQ partialed out) sug- 
gested that classes that students rated 
higher on Intimacy, Environment, Satisfac- 
tion, and Democracy, and lower on Speed, 
Friction, Favoritism, Cliqueness, Disor- 
ganization, and Apathy tended to score 
higher on the standardized achievement 
tests. 


CONCLUSION 


The predictive validity of the environ- 
ment scales has been shown in several fairly 
stringent probes. The scales administered to 
a random sample of students in the class
predicted the mean achievement of the
students not administered the scales. The
incremental predictive validity (accounting
for achievement variance beyond that ac- 
counted for by IQ), and the split-sample and 
incremental split-sample cross-validations 
also proved significant. The relationships be- 
tween the scales and achievement were con- 
sistent across classes of different mean IQ 
levels and nearly constant across the subject 
areas. 


CHANGES IN MEMORY ATTRIBUTE DOMINANCE
AS A FUNCTION OF AGE


JOEL S. FREUND¹ AND JUDY W. JOHNSON²
University of Arkansas 


Developmental changes in the relative dominance of three attributes 
of memory were investigated. It was hypothesized that first-grade 
children would code words primarily on the basis of their orthographic 
features, while in third-grade children, the acoustic attribute would 
dominate, and that college subjects would show a preference for the 
associative attributes. The nature of the errors on a multiple-choice 
recognition test was used to assess the relative dominance of the three 
attributes. First-grade subjects made more errors on the orthographic 
than acoustic distractors while the third-grade subjects made equal 
numbers of errors on these two types of distractors. The lack of a 
preponderance of associative errors on the part of the college sub- 
jects was attributed to the nature of the words used as associates. 


Underwood (1969) has proposed a theory 
of memory according to which information 
or events are stored and retrieved by means 
of various attributes. These attributes rep- 
resent different types of encoded informa- 
tion that can serve to discriminate one 
memory from another, and to act as re- 
trieval mechanisms for a particular mem- 
ory. 

The present study was concerned with 
only three types of attributes, namely, the 
acoustic, the orthographic, and the verbal 
associative. The acoustic attribute for the 
memory of a word is the word’s sound 
patterning when pronounced. The ortho- 
graphic attribute is the letters and the con- 
figuration of the word, while the associa- 
tive attribute consists of other words which 
may be elicited by the target word. 

One implication of this theory is that 
the attributes children use in establishing a 
memory may change with age, or age-re- 
lated experiences (e.g., School). Specifically, 


¹ Requests for reprints should be sent to Joel S.
Freund, Department of Psychology, University of
Arkansas, Fayetteville, Arkansas 72701.

² The authors would like to thank Walter
Brooks, Principal of the Bates Elementary School,
and Harry Vandergriff, Superintendent of the
Fayetteville Public Schools, and their cooperative
teachers for making the students available for
this study.

Underwood (1969) points out that in young
children the acoustic and spatial attributes 
may be dominant while in adults, the ver- 
bal-associative attributes will dominate. 
Thus, one might expect an increase in the 
use of verbal-associative attributes and a 
decrease in the use of the acoustic attri-
butes with increasing age and/or school ex-
perience.

Bach and Underwood (1970) found such 
a developmental change in the dominant 
attributes between second- and sixth-grade 
subjects. Using a multiple-choice recogni- 
tion task similar to that used by Under- 
wood and Freund (1968), they found that 
second-grade subjects made more errors on 
the acoustically similar distractors than on 
the associative distractors, while the re-
verse was true for the sixth-grade subjects.
From this it was concluded that there was
a shift in the dominant attribute, from
acoustic to associative, as educational level
increased from the second to the sixth
grade. 

Examination of the words used by Bach 
and Underwood (1970) reveals that the 
acoustically similar distractors were also
orthographically similar to the correct
words. It is possible, then, that the sub-
jects may have chosen the acoustic distract-
ors partially because of their orthographic
similarity to the correct words. Evidence
from other investigators indicates that
young children may depend heavily on an
orthographic attribute. Marchbanks and
Levin (1965) found that kindergarten and 
first-grade children used specific letters for 
recognition of nonsense words; the first 
letter used predominantly, with the last 
letter being second in importance. Willi- 
ams, Blumberg, and Williams (1970) also 
found these same letters to be the basis for 
recognition of nonsense words among socio- 
economically deprived children. These re- 
sults lead to the expectation that young 
children’s primary memory attribute may 
be orthographic rather than acoustic. 

The present study was an attempt to 
determine the relative importance of the 
orthographic, acoustic, and associative at- 
tributes for memory in young children, and 
to see if there is a shift in the dominant at- 
tribute with increasing age and educational 
experience. For this purpose, three age 
groups were tested, first-grade children, 
third-grade children, and college sopho- 
mores. It was expected that in the first- 
grade subjects, the orthographic attribute 
would be dominant over both the acoustic 
and verbal-associative attributes, but that 
the third-grade subjects would show a de- 
crease in orthographic coding and an in- 
crease in acoustic and associative coding. 
Underwood and Freund (1968) found that 
the associative attribute was the dominant 
attribute for recognition in college-stu- 
dent subjects, and this result was also ex- 
pected in the present study. 

The procedure previously used to detect 
differences in attribute dominance in rec- 
ognition memory of college students and 
children (Bach & Underwood, 1970; Un- 
derwood & Freund, 1968) was modified and 
used here. In the learning phase, the sub-
ject was presented 40 words, 1 at a time.
In the test phase, the subject was presented 
40 sets of five words, each containing the 
correct word from the learning trial, an 
associate of the correct word, a word acous- 
tically similar to the correct word, a word 
orthographically, but not acoustically sim- 
ilar to the correct word, and a neutral word, 
which had no relation to the correct word. 
Changes in the dominance of memory at-
tributes as a function of age should be de-
tected by differences in types of errors made 
on the recognition test. 


METHOD


Subjects 


There were 25 subjects in each group. The 
first-grade subjects (mean age was 6 years 6 
months) and the third-grade subjects (mean age 
was 8 years 7 months) were from the same school 
within the public school system. The enrollment
of this school represents a socioeconomic cross-
section of the city’s population. The college sub-
jects (mean age was 20 years 2 months) were
sophomores from the general psychology classes
at the University of Arkansas.


TABLE 1

WORDS USED ON RECOGNITION TEST

Correct   Associate   Acoustic   Orthographic   Neutral
mad       happy       had        mud            dime
bird      pigeon      heard      bold           rest
born      baby        horn       barn           fish
play      game        may        pony           sky
snake     afraid      take       slide          grow
come      on          from       cone           much
witch     fire        ditch      watch          start
day       night       lay        dry            room
walk      run         talk       work           moon
look      see         book       luck           then
clock     time        rock       clerk          idea
stop      go          hop        step           is
hard      soft        card       hand           miss
pool      swim        school     pull           whiz
big       little      pig        bag            wet
bark      dog         spark      back           left
wood      tree        could      wind           him
ride      car         hide       race           just
jeep      truck       peep       jump           grass
dark      light       lark       duck           page
purse     money       nurse      plane          corner
fan       air         pan        fun            but
crown     king        down       clown          feel
gate      shut        skate      give           chase
mark      write       shark      milk           apple
rose      red         nose       rope           sing
hat       cap         sat        hot            did
best      good        nest       belt           over
spell     word        tell       shell          white
bed       sleep       said       bad            rain
will      you         hill       well           sun
boot      shoe        suit       boat           people
way       home        stay       why            told
ball      park        call       bell           wise
far       near        star       for            help
slow      fast        no         show           as
coat      sweater     goat       colt           read
here      there       ear        have           space
new       old         two        now            cry
cake      eat         make       came           all


Materials 


The 40 words presented for learning and the 
additional 160 words used on the recognition test 
were selected from the readers of the first-grade 
subjects whose reading-word population consisted 
of a total of 464 words (O'Donnell, 1966). Those 
words are presented in Table 1. The word associa- 
tion tables and other information from Entwisle 
(1966) were used in choosing words for the associ- 
ate category. The words chosen for the neutral, 
orthographic, and acoustic categories were from
the present experimenters' judgments and Bach
and Underwood (1970). The words in the acoustic
category were chosen so as to have a similar
sound, but to start with a different letter than the
correct word. The words in the orthographic 
category were chosen to have the same first letter, 
the same last letter, the same number of letters, 
and, when possible, the same second letter as the 
correct word. They were also chosen to have 
acoustic features as dissimilar from the correct 
word as possible.
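
The selection rules for the orthographic and acoustic distractors are mechanical enough to be checked in a few lines. The sketch below applies the required orthographic criteria (same first letter, same last letter, same number of letters), the optional second-letter criterion, and the different-first-letter rule for acoustic distractors to five items from Table 1. The function names are illustrative, and acoustic similarity itself is not modeled, since it depends on pronunciation.

```python
def meets_orthographic_criteria(correct, distractor):
    """Required rules: same first letter, same last letter, same length."""
    return (len(correct) == len(distractor)
            and correct[0] == distractor[0]
            and correct[-1] == distractor[-1])


def shares_second_letter(correct, distractor):
    """Optional rule, applied 'when possible' in the study."""
    return len(correct) > 1 and len(distractor) > 1 and correct[1] == distractor[1]


def acoustic_starts_differently(correct, distractor):
    """Acoustic distractors were to begin with a different letter."""
    return correct[0] != distractor[0]


# (correct, acoustic, orthographic) triples taken from Table 1
SAMPLE_ITEMS = [
    ("mad", "had", "mud"),
    ("bird", "heard", "bold"),
    ("clock", "rock", "clerk"),
    ("spell", "tell", "shell"),
    ("coat", "goat", "colt"),
]

for correct, acoustic, orthographic in SAMPLE_ITEMS:
    print(correct,
          meets_orthographic_criteria(correct, orthographic),
          shares_second_letter(correct, orthographic),
          acoustic_starts_differently(correct, acoustic))
```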


Procedure 


The subjects were tested individually in an 
isolated room at their respective schools. At the 
beginning of the learning trial the subjects were 
instructed to try to remember each word so that 
later they could pick the word from among other 
words. Each word was printed in lower case on a 
sheet of 834 X 12 inch paper, and each word was 
presented for a period of 5 seconds. When the word 
was presented, both the subject and the experi- 
menter pronounced it. The only difference in the 
procedure with college subjects was that the 
words were not pronounced when presented. A 
single random order of the 40 words was used on 
the learning trial. The recognition test began im- 
mediately after the learning trial was completed. 
On the recognition test, the five words were 
printed in lower case letters on a sheet of 12 X 18 
inch paper, and the subject was asked to pick the 
word he had been shown earlier. The subjects 
were not timed, but a choice was required on each 
set. The order of the words on the recognition test 
was random, with the restriction that 10 of the 20
words occurring during the first half of the learn- 
ing trial appeared during the first 20 tested for 
recognition, and 10 occurred during the last 20. 
Therefore, 10 of the last 20 words on the learning 
trial occurred during the first half of the testing 
and 10 occurred during the last half. 
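
One way to produce a test order that satisfies this restriction is sketched below. The study used a single fixed order and does not say how it was generated, so the procedure, the function name, and the seed are assumptions of this example. Items are identified by their position (0 to 39) on the learning trial.

```python
import random


def constrained_test_order(n_items=40, rng=None):
    """Random test order in which each half of the learning list
    contributes equally to each half of the test."""
    rng = rng or random.Random()
    half = n_items // 2
    quarter = half // 2
    first_learned = list(range(half))           # learned in positions 0-19
    last_learned = list(range(half, n_items))   # learned in positions 20-39
    rng.shuffle(first_learned)
    rng.shuffle(last_learned)
    # Ten items from each learning half go to each test half.
    first_test_half = first_learned[:quarter] + last_learned[:quarter]
    second_test_half = first_learned[quarter:] + last_learned[quarter:]
    # Shuffle within each test half so the two sources are interleaved.
    rng.shuffle(first_test_half)
    rng.shuffle(second_test_half)
    return first_test_half + second_test_half


order = constrained_test_order(rng=random.Random(0))
print(order)
```

Balancing the halves in this way keeps the average retention interval comparable for early and late items on the learning trial.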


RESULTS 


As was the case in the Bach and Under- 
wood (1970) study, it is not known if the 
associative, acoustic, and orthographic dis-
tractors are equivalent in level or degree of
similarity to the correct response. There- 
fore, to say that at a given grade level one 
of these attributes dominates can have 
meaning only in relation to the materials 
used here. The hypothesis considered here, 
however, states that there will be changes 
in the relative frequency with which differ- 
ent types of errors occur at the three grade 
levels. Thus, the prediction is that there 
will be an interaction between grade level 
and error type. Any differences in absolute 
level of error frequency for the three types 
are irrelevant.

The mean numbers of recognition errors
on each type of distractor for each grade
level are presented in Table 2. There was a
significant decrease in the total number of
errors as grade level increased (F = 89.92, 
df = 2/72). It also can be seen that ap- 
proximately 47% of the errors made by 
the first-grade subjects were made in choos- 
ing the orthographic distractors, while the 
associative, acoustic, and neutral words 
were each chosen about 18% of the time. 
This was not the case for the third-grade 
and college subjects. Only about 28% of 
the errors made by the third-grade sub- 
jects were made by choosing the ortho- 
graphic alternatives, while the associative, 
acoustic, and neutral words were chosen 
28%, 29%, and 15%, respectively. Sim- 
ilarly, only about 14% of the college sub- 
jects’ errors fell on the neutral items, the 
rest being split equally among the remain- 
ing three alternatives. 
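
The percentages quoted for the third-grade subjects can be recovered from the third-grade means reported in Table 2. The following is a minimal sketch of that arithmetic; the function name and dictionary keys are illustrative and do not come from the original analysis.

```python
def error_shares(mean_errors):
    """Convert mean error counts per distractor type into whole-number percentages."""
    total = sum(mean_errors.values())
    return {kind: round(100 * value / total) for kind, value in mean_errors.items()}


# Third-grade means as printed in Table 2.
third_grade = {
    "associative": 4.1,
    "acoustic": 4.2,
    "orthographic": 4.0,
    "neutral": 2.2,
}

shares = error_shares(third_grade)
print(shares)
```

The four shares come out at 28%, 29%, 28%, and 15%, which agrees with the figures given in the text for the third-grade subjects.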

A 3 X 5 repeated measures analysis of 
variance was performed on the number of 
errors with the predicted interactions being 
tested by orthogonal comparisons. There 
were significantly more errors made on the 
orthographic than on the acoustic distrac- 
tors (F = 7.17, df = 1/72). The interac- 
tion of this comparison (acoustic vs. ortho- 
graphic) with grade level was also signifi- 
cant (F = 6.41, df = 2/72), due primarily 
to the large number of errors made on the 
orthographic alternatives by the first-grade 
subjects. 


TABLE 2 

Mean Number of Errors on Each Type of 
Distractor for Each Age Group 

                        Type of distractor 
Age group     Associate  Acoustic  Orthographic  Neutral      Total 
First grade      6.2       6.7         9.1         6.0      [illegible] 
Third grade      4.1       4.2         4.0         2.2         14 
College          2.8       2.3         2.6     [illegible]      1 

A second analysis of variance was per- 
formed on the first- and third-grade sub- 
jects alone, with similar results. The first- 
grade subjects produced more orthographic 
than acoustic errors, while the third-grade 
subjects showed equal frequency of errors 
on these two types. This interaction of 
error type with grade level was significant 
(F = 9.13, df = 1/48). 


Discussion 


One objective of the present study was 
to determine if there are developmental 
changes in the relative dominance of the 
orthographic and acoustic attributes in 
young children. The results clearly indicate 
that orthographic errors are more frequent 
than acoustic, associative or neutral errors 
for first-grade subjects, while for third- 
grade and college subjects this was not the 
case. This is interpreted as indicating that 
first-grade subjects are more likely to code 
a word by its orthographic features than 
are older subjects. The presence of an 
equal number of errors on the associative, 
acoustic and neutral choices by first-grade 
subjects indicates that the orthographic at- 
tribute may be the major attribute used by 
these subjects. The equal distribution of 
errors over the acoustic, associative and 
orthographic distractors for the third-grade 
subjects, coupled with the low incidence of 
errors on the neutral distractors, indicates 
that a single attribute may not be domi- 
nant in this age group. 

These findings, taken in conjunction with 
those of Bach and Underwood (1970), who 
concluded that the acoustic attribute was 
dominant for second-grade subjects, but not 
for sixth-grade subjects, indicate that third- 
grade subjects may be in a transition stage 
in which no attribute is dominant. How- 
ever, it is also possible that the acoustic 
attribute appeared dominant in the Bach 
and Underwood (1970) study because it 
was a combination of the orthographic and 
acoustic attributes, while in the present 
study these two were kept relatively in- 
dependent. 

Another finding not in accord with the 
previous evidence (Underwood & Freund, 
1968), is the lack of dominance of the as- 
sociative attribute in the college subjects. 
This may be due to either or both of two 
factors. First, there are changes in the as- 
sociation value of words with age (Ent- 
wisle, 1966) and the associative distractors 
were chosen from the first-grade norms, and 
thus may not have been representative of 
the associates elicited by college subjects. 
Second, the elicitation of some intralist as- 
sociations during the recognition test may have 
caused some interference. For example, book oc- 
curred early in the recognition test. Read oc- 
curred later as a neutral distractor, and 
70% of the college subjects’ errors on this 
item were read. A conceptual category was 
formed by moon, star, and sun; space oc- 
curred near the end of the recognition test 
as a neutral distractor. On this choice, 58% 
of the errors by college subjects were space. 


IMPROVING CREATIVE THINKING BY TRAINING IN THE 
PRODUCTION AND/OR JUDGMENT OF SOLUTIONS 


R. PAUL STRATTON AND ROBERT BROWN 
University of Kentucky 


Creativity training programs that emphasize the production or judg- 
ment of solutions lead to conflicting results: Production training 
increases productivity but decreases quality, and judgment training 
increases quality but decreases productivity. Four conditions (pro- 
duction training, judgment training, combined training, and no- 
training control) replicated previous results with separate training 
procedures. Data also showed that these seemingly incompatible 
training procedures could be combined to increase productivity, 
solution quality, and judgment accuracy over separate training. 
Data were presented which suggested a model of problem solving 
that features formation of criteria for solution evaluation and recur- 
sive comparisons between the criteria and information for solving 
the problem and between the criteria and completed solutions. Judg- 
ment training affected all stages of problem solving by making such 
comparisons more accurate, hence, limiting the use of unprofitable 
information and eliminating poor solutions. Production training 
affected the idea-generating stage alone. 


That improved problem-solving skills 
should be the goal of educational endeavors 
is axiomatic. Controversy exists, however, 
in the means by which to attain that end, 
as evidenced by the proliferation of training 
programs for children and adults (cf. Seferian 
& Cole, 1970). Previous research indicates 
that two general approaches lead to con- 
flicting results: training in the evaluation of 
completed solutions (judgment training) 
increases quality but decreases productivity, 
and training in idea-generating techniques 
(production training) increases productivity 
but decreases quality. The present study 
compared the separate and combined effects 
of these seemingly incompatible forms of 
training. 

Increased ideational fluency and flexibil- 
ity is the goal of a wide variety of training 
programs and techniques (cf. Davis, 
Manske, & Train, 1967). Morphological syn- 
thesis, for example, is a technique that allows 
one to categorize information which may 
contribute to problem solutions and to sys- 
tematically combine this information into 
an almost infinite number of possible so- 
lutions (Allen, 1962). In comparison to two 
other idea-generating techniques, Warren 
and Davis (1969) found increased produc- 
tivity and more superior solutions with the 
morphological synthesis technique. Further- 
more, this technique has been included in a 
large-scale training program for adolescents 
with favorable results (Davis, Houtman, 
Warren, & Roweton, 1969). Although mor- 
phological synthesis has not previously been used 
with the plot-title problem, as it is in 
the present study, one would expect that in 
comparisons with a no-training control it 
would result in increased productivity (num- 
ber of solutions and production rate) and 
thus a large number of superior solutions. 
The second approach to training creative 
problem solving has produced the opposite 
end product: a high probability that any 
given solution will be of high quality. Sev- 
eral studies have compared nonevaluative 
(or brainstorming) instructions to instruc- 
tions that informed the problem solver of 
the criteria by which his solutions were to 
be evaluated. Such criteria-cued instructions 
generally resulted in a reduced productivity 
compared to nonevaluative instructions, but 
also produced a higher average quality and a 
higher percentage of superior solutions (e.g., 
Johnson, Parrott, & Stratton, 1968; 
Weisskopf-Joelson & Eliseo, 1961). 

If criteria-cued instructions act to increase 
the problem solvers’ ability to evaluate 
their own solutions, increasing this judg- 
ment ability should increase average quality 
even more. An initial study by Johnson and 
Zerbolio (1964) was inconclusive because of 
a lack of an adequate judgment training ex- 
perience. Johnson et al. (1968) developed 
such a program and demonstrated that it 
effectively increased judgment ability in 
evaluating one’s own solutions and on a 
multiple-choice judgment test. 

The next question was whether judgment 
training would also influence solving a second 
problem in a transfer-of-training paradigm. 
Stratton, Parrott, and Johnson (1970) com- 
pared judgment training to criteria-cued 
instructions. For subjects requested to write 
many solutions, judgment training reduced 
productivity and increased average quality 
and the percentage of superior solutions. 
The quality data were replicated with sub- 
jects writing only one solution. In both ex- 
periments, only judgment training increased 
judgment accuracy. It is expected that, 
when compared with production training 
(i.e., morphological synthesis), judgment 
training will reduce the number of solutions 
produced but will increase the average qual- 
ity and the percentage of superior solutions. 
Combining judgment and production train- 
ing represents the third training procedure 
tested in the present experiment. Production 
training should provide an idea-generating 
technique which, when combined with judg- 
ment training, will increase the total number 
of solutions produced compared to judg- 
ment training alone and a control condition. 
Several studies (e.g., Gerlach, Schutz, Baker, 
& Mazer, 1964; Johnson et al., 1968) have 
shown that such increased productivity also 
leads to more solutions of very high quality. 
If judgment training were to teach subjects 
to accurately evaluate solutions such that 
only these very good solutions were recorded, 
combining production and judgment train- 
ing procedures would also result in enhanced 
overall quality, percentage of superior solu- 
tions, and judgment accuracy, compared to 
production training alone and a control con- 
dition. Thus, combined training should yield 
some advantages over each separate training 
procedure. 

Many studies have reported that learning 
is accelerated when instructional methods 
are tailored to students’ abilities (cf. Bracht, 
1970). It is especially important to deter- 
mine the most effective teaching method for 
students who exhibit poor initial perform- 
ance. It is, then, necessary to investigate 
individual differences when comparing differ- 
ent training procedures. In the present study 
ability level was defined in terms of perform- 
ance on a pretraining problem and judgment 
test. For example, when judgment accuracy 
was analyzed, scores on the pretraining judg- 
ment test were ranked and divided into high, 
medium, and low levels within each group. In 
this way, changes in performance between the 
pretraining and posttraining tests could be 
compared for different ability levels, as well 
as for different types of training. This con- 
forms to a 4 (training procedure) X 3 (abil- 
ity level) factorial design using difference 
scores as a dependent variable. 
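
The blocking step described above (ranking pretraining scores within a group and dividing them into high, medium, and low levels) can be sketched as follows. This is only an illustration under stated assumptions: the subject labels and scores are hypothetical, the group size is assumed to divide evenly by three, and ties are broken by input order, a detail the text does not specify.

```python
def assign_ability_levels(pretest_scores):
    """Rank one group's pretraining scores and split the ranking into thirds.

    pretest_scores maps a subject label to that subject's pretraining score.
    Ties keep their input order (an assumption; the source does not say).
    """
    ranked = sorted(pretest_scores, key=pretest_scores.get, reverse=True)
    third = len(ranked) // 3
    levels = {}
    for position, subject in enumerate(ranked):
        if position < third:
            levels[subject] = "high"
        elif position < 2 * third:
            levels[subject] = "medium"
        else:
            levels[subject] = "low"
    return levels


def difference_score(pretraining, posttraining):
    """Positive values indicate improvement from the first to the second test."""
    return posttraining - pretraining


# Hypothetical judgment-test scores for six subjects in one training group.
scores = {"s1": 9, "s2": 4, "s3": 7, "s4": 2, "s5": 6, "s6": 5}
levels = assign_ability_levels(scores)
print(levels)
```

Each training group is blocked separately, so every cell of the 4 X 3 design receives the same number of subjects.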

In summary, the present study compared 
the relative advantages of three training 
procedures (production, judgment and com- 
bined) to a no-training control for three 
levels of initial ability. Comparisons were 
made on six dependent variables measuring 
aspects of productivity, solution quality, and 
judgment accuracy, each of which was as- 
sessed before and after training. 


METHOD 


Materials 

Each session consisted of three segments: pre- 
training plot-title problem and judgment test, 
training or filler task, and posttraining plot-title 
problem and judgment test. 

The plot-title problem is a complex problem 
which approximates a legitimate academic exer- 
cise and demands divergent thinking (Guilford, 
1967). Two plot-title problems were used. The 
instructions for all subjects and both problems 
read, 


Below is the plot for a novel, movie, or play. 
Your task is to write as many clever titles for 
it as you can. By clever we mean an imagina- 
tive, creative, or unusual title for this plot. 
A good title, however, must also be appro- 
priate to the entire plot and the characters. 


The judgment training program was a short- 
ened version of the one used by Johnson et al. 
(1968) and utilized the pretraining plot and solu- 
tions as a vehicle for training. The training pro- 
gram consisted of a description of the plot-title 
rating procedure and the criteria used by the 
judges. Examples and explanations were given for 
poor, mediocre, and superior titles. Practice in 
selecting the best of three titles was supplemented 
by immediate feedback, and finally, subjects were 
given an opportunity to verbally state their own 
criteria for a good title. 

The production training program was adapted 
to the plot-title problem from Allen’s (1962) de- 
scription of the morphological synthesis method 
of problem solving and, like the judgment train- 
ing program, utilized the first plot-title problem 
as an example of how to use the technique. A pilot 
study verified the effectiveness of this program in 
increasing the number of solutions over instruc- 
tions emphasizing productivity. This technique 
requires that subjects construct an idea table with 
one column for each major division, or major 
theme, in the story. Within each column, subjects 
then list all related information from the story. 
Thus, all information related to the plot is sys- 
tematically explicated and all possible com- 
binations of major and minor details can be ex- 
amined as potential titles. The training program 
consisted of an explanation of the technique, sev- 
eral examples of how it could be used, and practice 
in using it with the first plot-title problem. The 
last part of this training consisted of reading the 
plot for the second problem and making an idea 
table for it. This was presented to the subjects as 
further practice in constructing idea tables, and 
there was no indication that this would be the 
next problem. When these subjects worked on the 
second problem, however, they were instructed 
to use this table. 
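
The idea-table technique described above can be illustrated with a short sketch. The column names and entries below are hypothetical, and the rule of taking exactly one entry from each column is an assumption made for the example; the text says only that combinations of major and minor details are examined as potential titles.

```python
from itertools import product

# Hypothetical idea table: one column per major theme of a plot,
# each column listing related details taken from the story.
idea_table = {
    "hero": ["a shy clerk", "an inventor"],
    "setting": ["a small town", "a carnival"],
    "event": ["a lost letter", "a flood"],
}


def one_from_each_column(table):
    """Enumerate every combination that takes exactly one entry per column."""
    return list(product(*table.values()))


combos = one_from_each_column(idea_table)
for combo in combos:
    print(" / ".join(combo))
```

Each printed combination is raw material for a candidate title; the subject still has to turn it into wording and decide whether it is worth recording.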

Combined training consisted of both of the 
above training procedures with no attempt to 
abbreviate or integrate the separate training pro- 
grams. 

Control subjects worked on a filler task which 
was assumed to have no effect on the plot-title 
problem. It consisted of rating adjective-noun 
association strength and writing sentences using 
specified adjectives and nouns. 

Assessing the subjects’ judgment ability is 
important in determining the success of judgment 
training. A measure of judgment accuracy which 
has practical importance is subjects’ ability to 
evaluate their own solutions, specifically the abil- 
ity to identify their best solutions. Any score for 
evaluating one’s own solutions, however, is de- 
pendent upon the quality of the titles judged. If 
subjects were to produce only good titles, as is the 
case after judgment training, the discrimination 
would be more difficult than if the subjects had 
produced some poor and some good titles. Fur- 
thermore, the measure of the accuracy of the sub- 
jects’ judgments, the difference in quality between 
the selected title and the rest, would be less in the 
former case, even with perfect judgment. Thus, 
it is necessary to assess judgment accuracy inde- 
pendently of the plot titles produced. This was 
done in the present experiment by using ten-item 
multiple-choice tests that consisted of titles 
written by other subjects which had been judged 
previously for quality. The task was to select the 
one title out of five for each item which “most 
closely matches your criteria for a superior title 
for this plot.” A similar testing procedure was 
used by Johnson et al. (1968) and Stratton et al. 
(1970) and was found to discriminate between 
naive subjects and subjects with judgment train- 
ing. 


Procedure 


The first plot-title problem and judgment test 
was used to ascertain an initial level of ability. 
All subjects were given 7 minutes to write as many 
titles as possible. Then the subjects were given 10 
minutes to work through the multiple-choice judg- 
ment test. 

Booklets were collected and the training pro- 
grams were administered. All subjects trained in- 
dividually and were finished in 20 or 40 (combined 
training) minutes. Production training and judg- 
ment training subjects received the training pro- 
grams described above. The subjects receiving the 
combined training received the same judgment 
and production training booklets, but half (23) 
received the judgment training booklet first and 
the other half (22) received the production train- 
ing booklet first. (Comparisons on all variables 
revealed no differences between presentation or- 
ders; all ts less than 1.0. Thus, data for both or- 
ders were combined for all further analyses.) Con- 
trol subjects received a neutral filler task which 
required the same amount of time. 

After all of the subjects had completed 
the training programs, the booklets were collected. 
Production and combined training subjects re- 
tained the idea table they had made for the second 
problem. A third booklet was then distributed. 
It contained the second plot-title problem, the 
multiple-choice judgment test, and last, unrelated 
problems. To investigate the motivational aspects 
of the training programs, subjects were given 
an unlimited amount of time to go through the 
booklet. The proctor had subjects record elapsed 
time by underlining the title they were working 
on at the end of each 1-minute period. When a subject 
had produced no titles in 3 minutes, he was in- 
structed to turn to the judgment test. Then, sub- 
jects worked on unrelated problems until everyone 
had completed the judgment test. This prevented 
anyone from leaving the room early. 


Subjects 


Introductory psychology students at the Uni- 
versity of Kentucky who represented all levels of 
college training volunteered for this experiment 
as a course requirement. Of the 196 participants, 6 
subjects were rejected, some from each group, 
because of a failure to complete all of the mate- 
rials, and 10 were randomly eliminated to form 
four treatment groups of 45 subjects each. Four- 
teen groups of from 10 to 20 subjects were tested, 
and treatments were serially assigned to groups. 


RESULTS 


Two experienced judges independently 
judged the quality of each solution on a 1 
(bad) to 7 (good) scale based on cleverness 
and appropriateness. Interjudge reliability 
was .85 and .87 for the two problems, re- 
spectively. To remove judgment biases due 
to halo effects, penmanship, and knowledge 
of group membership, each solution was 
typed on separate 3 X 5 cards and coded on 
the reverse side for group, subject, problem, 
and solution numbers. The cards were then 
shuffled and presented to the judges who in- 
dependently rated solution quality. To 
insure independence of the judgments, the 
first judge recorded his rating on the back 
of the card, and the second judge recorded 
his on the front, then on the back. 

Each subject received scores for six 
dependent variables on the first and second 
problems. The change in performance be- 
tween problems was determined for each 
dependent variable by subtracting the first 
problem score from the second problem 
score; that is, positive difference scores in- 
dicated an improvement from the first to the 
second problem. These data are presented in 
Table 1. The six dependent variables are 
defined as follows: judgment accuracy, the 
number correct on the multiple-choice judg- 
ment test; overall quality, the mean pooled 
quality rating (range 2-14) of all solutions 
for each subject; number superior, the number 
of solutions receiving a quality rating above 
the ninetieth percentile for all solutions ob- 
tained from a given problem; number of solu- 
tions, the number of solutions produced for 
a problem; time, the time spent working on the 
second problem (subjects worked for 7 min- 
utes on the first problem but time varied on 
the second); production rate, the number of 
solutions produced per minute. 
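
The scoring just described can be sketched in a few lines. This is an illustration, not the original scoring procedure: the function names, the sample ratings, and the cutoff of 11 are hypothetical, and the pooled rating is assumed to be the sum of the two judges' 1-7 ratings, which is consistent with the stated range of 2-14. Time is left out of the difference scores because it varied only on the second problem.

```python
def score_problem(pooled_ratings, minutes_worked, superior_cutoff):
    """Score one subject's solutions to one plot-title problem.

    pooled_ratings: one pooled quality rating (2-14) per solution.
    minutes_worked: time spent on the problem, assumed to be positive.
    superior_cutoff: the ninetieth-percentile rating for that problem.
    """
    count = len(pooled_ratings)
    return {
        "overall_quality": sum(pooled_ratings) / count if count else 0.0,
        "number_superior": sum(rating > superior_cutoff for rating in pooled_ratings),
        "number_of_solutions": count,
        "production_rate": count / minutes_worked,
    }


def difference_scores(first, second):
    """Second-problem score minus first-problem score for each measure."""
    return {name: second[name] - first[name] for name in first}


# Hypothetical subject: four titles in 7 minutes, then three titles in 6 minutes.
first = score_problem([6, 8, 7, 5], minutes_worked=7, superior_cutoff=11)
second = score_problem([10, 12, 9], minutes_worked=6, superior_cutoff=11)
diff = difference_scores(first, second)
print(diff)
```

For this hypothetical subject the number of solutions falls by one while mean quality and the number of superior solutions rise, the pattern the text associates with judgment training.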

Time on the second problem was analyzed 
with a single-factor analysis of variance, 
since subjects differed only on the second 
problem. Analyses of variance for all other 
dependent variables conformed to a 4 (train- 
ing procedure) X 3 (ability level) factorial 
design (Lindquist, 1953). 

Table 1 presents the difference score means 
for five dependent variables and mean time 
spent on the second problem. Obtained F 
values for training effects are above 4.64 for 
all variables and significant beyond the .01 
level with dfs of 3/168 and 3/176. (Analyses 
of covariance with ability as covariate also 
gave F values significant beyond the .01 
level for all variables, except time for which 
that analysis is inappropriate.) Comparisons 
of ability levels for all variables—except 
number of solutions (F < 1)—yield F values 
above 9.10 which are significant beyond the 
.01 level (df = 2/168). The interaction be- 
tween training and ability level is not signifi- 
cant for any variable; the F values do not 
exceed 2.23 (df = 6/168). Subsequent 
planned comparisons were made with New- 
man-Keuls tests, and differences signifi- 
cant beyond the .05 level are reported below 
and are indicated in Table 1. 
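
The degrees of freedom reported above follow from the design if each of the four groups of 45 subjects is split into three ability levels of 15. That equal split is an assumption consistent with the description of the blocking, and the function names below are illustrative only.

```python
def factorial_dfs(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a two-factor between-subjects design with equal cells."""
    cells = levels_a * levels_b
    return {
        "training": levels_a - 1,
        "ability": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": cells * n_per_cell - cells,
    }


def one_way_dfs(groups, n_per_group):
    """Degrees of freedom for a single-factor between-subjects design."""
    return {"between": groups - 1, "error": groups * n_per_group - groups}


factorial = factorial_dfs(levels_a=4, levels_b=3, n_per_cell=15)
single = one_way_dfs(groups=4, n_per_group=45)
print(factorial)
print(single)
```

The factorial case gives 3, 2, and 6 degrees of freedom against an error term of 168, and the single-factor case used for time gives 3 against 176, matching the values in the text.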

The subjects who received judgment train- 
ing became better judges of solution quality 
and produced solutions of higher mean qual- 
ity than subjects receiving production train- 
ing alone or no training. Both forms of judg- 
ment training produced results on these 
measures which were equally superior to the 
performance of subjects not receiving judg- 
ment training, and the latter subjects per- 
formed about the same. 

The number of superior solutions, a second 
measure of solution quality, reveals that all 
trained subjects produced more superior 
solutions than untrained subjects, and train- 
ing procedures cannot be statistically dis- 
tinguished from one another. Increased pro- 
duction of superior solutions, however, can 
be a consequence of increased productivity 
as well as increased judgment accuracy 
(Johnson et al., 1968). Taking the percentage 
of solutions that are superior cancels out 
the effect of quantity leaving only the effect 
of judgment accuracy. The control (7%) and 
production training (4%) subjects produced 
a lower percentage of superior solutions than 
combined training (14%) and judgment 
training (17%) subjects. This implies that 
subjects with production training alone gen- 
erated superior solutions by virtue of their 
fluency, and subjects with either form of 
judgment training produced superior solu- 
tions by virtue of their selectivity. 

Correlational data further verify the rela- 
tionship between judgment accuracy and 
quality measures. On the second problem, 
where subjects varied widely in judgment 
accuracy, the number of correct choices on 
the judgment test was significantly corre- 
lated with mean quality (r = .27, p < .01) 
and number of superior solutions (r = .28, 
p < .01) with dfs of 178. Measures of pro- 
ductivity, however, are negatively corre- 
lated with mean quality (number of solutions, 
r = -.26, p < .01; time, r = -.12, p < .1; 
production rate, r = -.29, p < .01) with dfs 
of 178. Thus, comparisons between individ- 
ual subjects corroborate the conclusions 
based on group comparisons that mean qual- 
ity is positively related to judgment accuracy 
and negatively related to productivity. 

Comparisons on productivity measures 
(number of solutions, time, and production 
rate) reveal the benefits of production train- 
ing. The subjects with production training 
alone used more time, produced more solu- 
tions, and produced solutions at a faster rate 
than other subjects. Production training 
combined with judgment training increased 
the quantity of solutions above that for sub- 
jects with no training principally by in- 
creasing the working time, as the rate of 
production was lower than that for no-train- 
ing subjects. Judgment training alone de- 
creased productivity compared to both forms 
of production training. The subjects with 
judgment training alone wrote as few solu- 
tions as control subjects but took more time 
for each solution. 

All dependent variables, except number 
of solutions, yielded a significant effect due 
to initial ability level. In every case the 
greatest improvement came from subjects 
lowest in initial ability and the least im- 
provement from those highest in ability. 
The lack of a significant interaction between 
training procedure and ability indicated 
that with every training procedure the 
amount of improvement was inversely re- 
lated to initial ability level, even though 
average gains differed between types of 
training. 


Discussion 


These results suggest that production and 
judgment training procedures offer unique 
benefits for improving creative thinking. 
Production training increased productivity 
but left quality unchanged relative to con- 
trol procedures. In addition, these subjects 
could evaluate the quality of solutions no 
better than naive subjects. On the other 
hand, judgment training increased quality 
but at the sacrifice of productivity. These 
subjects could evaluate solution quality 
very accurately. The results for combined 
training have important theoretical and 
practical implications; combined training 
produced higher quality solutions than pro- 
duction training alone and more solutions 
than judgment training alone. 

Production training with the morpho- 
logical synthesis technique used separately 
or combined with judgment training in- 
creased productivity. But, when production 
training was used alone, quality and judg- 
ment accuracy were no better than with no 
training. The morphological synthesis tech- 
nique provides the framework within which 
to analyze information in the plot and to 
synthesize such information to form plot 
titles. For example, over 12,000 combina- 
tions of ideas could evolve from an idea 
table with four categories and four entries 
in each category. Increased productivity 
could result from mechanical transcription 
of titles from the idea table to the answer 
sheet. It is also likely that motivation was 
enhanced due to the rewarding nature of 
increased fluency and the novelty of some 
of the combinations. 

Judgment training used alone reduced 
productivity, but both forms of judgment 
training increased overall quality, the per- 
centage of superior solutions and judgment 
accuracy. Thus, the probability of any single 
solution being of high quality was increased. 
Judgment accuracy was found to be posi- 
tively related to measures of quality in com- 
parisons between training groups and in 
correlations. It is reasonable to expect, then, 
that subjects with judgment training could 
have accurately evaluated their own solu- 
tions and eliminated the poor ones. This 
could account for the increased quality and 
slow rate of production after judgment train- 
ing. 

To assert, however, that subjects evaluate 
only completed solutions is to assume that 
the problem solver is unresponsive to judg- 
mental instructions and skills until after 
solution emission. Recent studies have 
shown that subjects with directing instruc- 
tions selectively attend to and retain infor- 
mation in a text. Thus, information that is 
consistent with instructions is more avail- 
able for use (e.g., in solving a problem) than 
information which is incidental to the in- 
structions (Frase, 1969, 1970; Frase & Sil- 
biger, 1970). In the plot-title problem, 
judgment training could instruct subjects to 
focus on information matching the criteria 
for solution evaluation, for example, humor- 
ous events. 

Evidence for such presolution evaluation 
was obtained from the idea tables con- 
structed by subjects with production train- 
ing alone (n = 45) and subjects with judg- 
ment training prior to production training 
(n = 23). If subjects with judgment training 
selectively utilized plot information in con- 
structing idea tables, their idea tables 
should contain fewer entries, fewer poor 
ideas, and more good ideas. The data con- 
firmed these predictions. Judgment training 
received before production training reduced 
the mean number of entries in the idea 
tables (15.90 vs. 17.22), decreased the mean 
number of poor ideas (4.10 vs. 5.88) and 
slightly increased the mean number of good 
ideas (3.30 vs. 3.14). The advantage of 
judgment training for information selection 
while gathering information for problem 
solving would have been enhanced were 
memory load increased, for example, by 
lengthening the plot, presenting the plot 
for a shorter period of time, or removing the 
plot before solution construction. In the 
present study the plots were very short 
(under 100 words) and were always avail- 
able. 

Process analyses of problem solving (e.g., 
Johnson & Jennings, 1963) reveal three 
classes of activities while problem solving: 
preparation, production, and judgment. 
Preparatory activities include the definition 
of the discrepancy between the desired situ- 
ation and the present situation. To define 
the desired situation it is necessary to con- 
struct criteria for the acceptance of a solu- 
tion. The present data suggest that such 
criteria may be used to select information 
contributing to problem solutions and to 
evaluate completed solutions. Such recursive 
operations should continue until a match is 
made between the criteria and the informa- 
tion or between the criteria and a solution. 
In other words, activities that lead to the 
formation of accurate criteria will improve 
problem solving from the information search 
stage to the solution evaluation stage. 

The implications for education are evi- 
dent. It is most desirable for training pro- 
grams in creative problem solving to dem- 
onstrate transfer from training activities to 
classroom activities. Such transfer would be 
enhanced if students were instructed in the 
methods of acquiring the criteria for solution 
evaluation such as question asking, library 
search, etc., as well as idea-generating tech- 
niques such as morphological synthesis. 
Knowledge of such criteria would give prob- 
lem solvers a well-defined goal and a means 
by which to evaluate their own ideas, rather 
than relying on someone else for evaluation. 
Criterial information, content information, 
and idea-generating techniques could be 
learned if problem solving with class con- 
tent were an everyday activity. 

The training procedures used in the pres- 
ent study represent classes of training pro- 
cedures stressing either production or judg- 
ment aspects of the problem-solving process. 
Evaluation of the utility of similar programs 
in industry or education must be guided by 
the problem solving situation and the de- 
sired outcome. If sheer quantity of ideas is 
desirable, for example from a group of re- 
luctant salesmen, nonevaluative production 
training procedures may be more applicable. 
This would be especially true if an outsider 
were to do the final evaluation and the addi- 
tional time for external evaluation were of 
no consequence. On the other hand, when 
self-evaluation of some sort is necessary, 
such as when the cost due to a poor solution 
is great or only one solution is required, then 
a form of judgment training would be war- 
ranted. In other words, a form of production 
training gets more ideas out of reluctant con- 
tributors while a form of judgment training 
increases the probability of the success of 
each solution. The practical consequence of 
the combined training is that both produc- 
tivity and probability of success can be in- 
creased. 

These data also have implications for the 
assessment of creative thinking abilities. 
Consistent with Guilford’s (1967) structure 
of intellect model, the present data demon- 
strate that evaluation and production are 
separate but interdependent abilities. While 
divergent thinking abilities such as fluency 
are important in creative problem solving, 
the evaluative aspects cannot be overlooked. 
Overall solution quality, percentage of su- 
perior solutions, and judgment accuracy 
measures represent important aspects of
creative thinking, and reflect the ability of 
the individual to function autonomously in 
selecting information for solving problems 
and in evaluating final solutions. Moreover, 
the same evaluative skills should play a part 
in selecting the very problems to be solved. 

Whether evaluative abilities represent spe-
cific information or attitudes cannot be de- 
termined at the present, but both can be 
assessed and should predict other aspects of 
creative behavior.


PERSONALITY AND IQ MEASURES AS PREDICTORS OF
SCHOOL ACHIEVEMENT


K. BARTON, T. E. DIELMAN, AND R. B. CATTELL
University of Illinois 


One hundred and sixty-nine sixth graders and 142 seventh graders
were given the High School Personality Questionnaire, the Culture
Fair Intelligence Test, and then standardized achievement tests 2
months later. Regression analyses were performed to predict achieve-
ment from the personality and IQ variables. Conclusions were:
(a) the personality factor “conscientiousness” and IQ predicted
achievement in all areas (p < .01); (b) grade-specific factors were
important (for example, “warmheartedness” predicted achieve-
ment in all areas in the sixth grade only [p < .01]); and (c) certain
specific achievement areas had their own unique set of predictor
variables. For example, in mathematics, “adventurousness” was re-
lated to achievement in both grades (p < .01).


Many studies (Edwards & Tyler, 1965; 
Shinn, 1956; Wellman, 1957; Wolking, 
1955) have related measures of IQ to meas- 
ures of achievement in school. On the aver- 
age, such studies indicated a correlation of 
approximately .70 between IQ and achieve- 
ment. As this figure shows that only about 
50% of the variance in achievement scores 
can be accounted for in terms of IQ, several 
investigators looked elsewhere for other de- 
termining variables. Thus, in the 1940s and 
1950s, attempts were made to link the pre- 
diction of achievement with personality 
variables. At first this attempt did not seem
to meet with much success. Middleton and
Guthrie (1959) summarized the results of 
many studies and concluded that "the prin- 
cipal difficulty is probably the heterogeneity 
of the criterion, the antiquity of the person- 
ality measures used and the nonsummative 
or non-linear predictions [p. 66]." 
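
The "about 50%" figure cited above follows from squaring the correlation: a predictor that correlates r with a criterion accounts for r² of its variance. A minimal sketch of that arithmetic, in which the function name is purely illustrative:

```python
def shared_variance(r: float) -> float:
    """Proportion of criterion variance accounted for by a predictor
    that correlates r with the criterion (the squared correlation)."""
    return r ** 2


# Correlation between IQ and achievement reported in the text.
iq_achievement_r = 0.70

explained = shared_variance(iq_achievement_r)
unexplained = 1.0 - explained

print(f"explained by IQ: {explained:.2f}")
print(f"left for other variables: {unexplained:.2f}")
```

Roughly half of the achievement variance is therefore left over for personality or other variables to explain.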

In the last 10 years or so, however, results
have been more consistent and there is now 
quite a substantial amount of literature re- 
lating personality factors to school achieve- 
ment. Warburton (1961; 1962a; 1962b) re- 


* Requests for reprints should be sent to Keith 
Barton, Department of Psychology, University of 
Illinois, 525 Psychology Building, Champaign, Il- 
linois 61820. 


viewed this area in detail. More recent work 
has been done by Cattell and Butcher 
(1968) and it is this work that provided the 
major impetus for the present study. Cattell 
and Butcher attempted to predict both 
school achievement and creativity from 
ability, personality, and motivation meas- 
ures. The authors met with considerable 
success in indicating the importance of per- 
sonality factors in school achievement and 
showed that the addition of such personality 
measures in the prediction equations re- 
sulted in significantly greater multiple Rs 
than when ability measures alone were used. 
In the attempt to link motivation measures
to school achievement, however, results were
disappointing on the whole. The correla- 
tions that did exist between motivation 
measures and achievement were generally 
low and insignificant. 

The present study was designed to assess 
more fully the relative importance of ability 
and personality variables in the prediction 
of school achievement in a variety of areas. 
The specific hypotheses to be tested were 
(a) by using a combination of intelligence 
and personality variables to predict school
achievement, better predictions would be 
obtained than when intelligence variables 
were used alone; and (b) achievement in
different subject areas would be predicted best
by a set of personality and intelligence vari-
ables unique to each area. 


METHOD


Subjects 


The subjects in the study consisted of 169 sixth- 
grade and 142 seventh-grade students enrolled in 
Woodrow Wilson Junior High School in Decatur, 
Illinois. Approximately an equal number of boys 
and girls were involved. Socioeconomic status
ranged from upper-middle class to lower class in 
about equal proportions. 


Measures 


All subjects completed the Culture Fair Intelli-
gence Test and the High School Personality Ques-
tionnaire in January 1970. In addition, four stand-
ardized achievement tests (Educational Testing
Service) in the areas of mathematics, science, so-
cial studies, and reading were given in March
1970, thus allowing a 2-month interval between the
predictors and the criteria.

The High School Personality Questionnaire
measures a set of 14 factorially independent di-
mensions of personality. A brief description of
these factors is shown in Table 1.


Procedure 


All tests were computer-scored and correlations
obtained among the personality, ability, and
achievement variables.² Multiple-regression analy-
ses were performed to obtain measures of effective-
ness of each variable and different combinations of
variables in the prediction of school achievement.
R²s are shown for each achievement test in Ta-
bles 2 and 3. The correlations of individual per-
sonality and IQ variables with the achievement
scores are shown in Table 4.

The regression analyses shown in Tables 2 and
3 were designed to test several hypotheses relat-
ing to the predictive value of different intelligence
and personality variables in the area of school
achievement.


² The correlations among the personality and
Culture Fair Intelligence Test measures for each
grade may be obtained by ordering document
01794 from ASIS-National Auxiliary Pub-
lications Service, c/o CCM-Information Corpora-
tion, 866 Third Avenue, New York, New York
10022; remitting $2 for each microfiche or $5
for each photocopy.


TABLE 1

PERSONALITY FACTORS MEASURED BY THE HIGH
SCHOOL PERSONALITY QUESTIONNAIRE

Low-score description    Designation    High-score description
                         of factor
Reserved                     A          Warmhearted
Low intelligence             B          High intelligence
Emotionally unstable         C          Emotionally stable
Undemonstrative              D          Excitability
Submissive                   E          Dominance
Desurgency                   F          Surgency
Weak superego                G          Strong superego
Shy                          H          Socially bold
Toughminded                  I          Tenderminded
Zestful                      J          Reflective
Self-assured                 O          Apprehensive
Group dependency             Q2         Self-sufficiency
Uncontrolled                 Q3         Controlled
Relaxed                      Q4         Tense


Model 1 (Culture Fair Intelligence Test + High
School Personality Questionnaire versus Culture
Fair Intelligence Test) was designed to indicate
whether an increase in prediction of achievement
occurred when the whole High School Personality
Questionnaire was added to the Culture Fair In-
telligence Test results.

Model 2 (Culture Fair Intelligence Test + High 
School Personality Questionnaire versus Culture 
Fair Intelligence Test + B) indicated whether the 
addition of just the nonintelligence factors in the 
High School Personality Questionnaire increased 
achievement prediction over and above that ob- 
tained when both the Culture Fair Intelligence 
Test and Factor B were used. 

Model 3 ([Culture Fair Intelligence Test + 
(High School Personality Questionnaire — B)] ver- 
sus Culture Fair Intelligence Test) allowed for the 
assessment of the nonintelligence factors in the 
High School Personality Questionnaire over and 
above the effectiveness of the Culture Fair Intel-
ligence Test alone.

Model 4 (Culture Fair Intelligence Test + B
versus Culture Fair Intelligence Test) showed the
effectiveness of adding Factor B to the only other
intelligence measure, the Culture Fair Intelligence
Test.

Models 5, 6, 7, and 8 indicated, respectively,
the effectiveness in prediction of achievement of
the High School Personality Questionnaire alone,
the High School Personality Questionnaire non-
intelligence factors alone, Factor B alone, and the
Culture Fair Intelligence Test alone.
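
Comparisons such as Models 1 through 4 set a fuller predictor set against a reduced one. The sketch below illustrates one common way to test such an increment, the F ratio for a gain in R². The text does not state the exact test the authors used, so the formula, the function names, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from an ordinary least-squares fit that includes an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def f_increment(r2_full, r2_reduced, n, k_full, k_reduced):
    """F ratio for the gain in R^2 when predictors are added to a reduced model.

    n is the number of subjects; k_full and k_reduced are predictor counts.
    Returns (F, numerator df, denominator df).
    """
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2


# Hypothetical data shaped like the sixth-grade sample: one IQ score,
# 13 nonintelligence personality factors, one achievement score.
rng = np.random.default_rng(0)
n = 169
iq = rng.normal(size=n)
personality = rng.normal(size=(n, 13))
achievement = 0.5 * iq + 0.4 * personality[:, 0] + rng.normal(size=n)

# Analogue of Model 3: IQ plus nonintelligence factors versus IQ alone.
r2_reduced = r_squared(iq.reshape(-1, 1), achievement)
r2_full = r_squared(np.column_stack([iq, personality]), achievement)
f, df1, df2 = f_increment(r2_full, r2_reduced, n, k_full=14, k_reduced=1)

print(f"R2 reduced = {r2_reduced:.3f}, R2 full = {r2_full:.3f}")
print(f"F({df1}, {df2}) = {f:.2f}")
```

Because the reduced model is nested in the full one, the full-model R² can never be smaller; the F ratio asks whether the gain exceeds what the added predictors would produce by chance.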
RESULTS 


Before examining the regression analyses
results, several inferences can be made from
the correlations shown in Table 4. It should
be noted that the correlations between any
given personality factor and achievement
score are of the same general magnitude and
sign over all four types of achievement tests
within any grade (ignoring nonsignificant
correlations). Factors that are significantly
related (p < .01) to all four measures of
achievement in both grades are the two
measures of IQ (Factor B of the High
School Personality Questionnaire and the
Culture Fair Intelligence Test) and Factor
G of the High School Personality Question-
naire, a measure of conscientiousness.

In the sixth grade, Factor A (warm-
hearted participation) is significantly re-
lated to all four measures of achievement
(p < .01). In addition, Factor H (adven-
turousness) is significantly related to
achievement in mathematics (p < .05) and
Factor Q4 (anxiety) is significantly related
(p < .01) to social studies.

In the seventh grade, Factor C (emotional
stability) is significantly correlated with
mathematics and reading (p < .01) and
with social studies and science (p < .05).
Factor E (dominance) is significantly re-
lated to mathematics (p < .01). Once again,
as in the sixth grade, Factor H is signifi-
cantly related to mathematics (p < .05).
Factor I (toughmindedness) is significantly
related to mathematics (p < .01) and sci-
ence (p < .05), and Factor J (desire for
group action) is significantly related to
mathematics and science (p < .01) and to
social studies (p < .05). Factor O (self-as-
suredness) is significantly related to all
achievement measures, as is Factor Q3 (ex-
acting willpower).

In both the sixth and seventh grades,
Model 1 indicates that the addition of the
High School Personality Questionnaire to
the Culture Fair Intelligence Test results in
a significant increase (p < .01) in the
amount of variance accounted for in
achievement in all four areas; in many
cases this involves a doubling of the vari-
ance. Model 2 indicates that in the sixth
grade, adding the nonintelligence factors of
the High School Personality Questionnaire
to the Culture Fair Intelligence Test and
Factor B results in increased power of pre-
diction for reading only (p < .05). How-
ever, at the seventh grade this is true for
both reading and mathematics (p < .05).
Model 3 indicates that even if only the non-
intelligence factors are added to the Culture
Fair Intelligence Test, a significant increase
(p < .05) in variance is obtained in the
sixth grade, but this is only true for mathe-
matics and reading in the seventh grade.
The significance of just using the nonintelli- 
gence factors of the High School Personality 
Questionnaire alone is given by Model 6, 
and the tables show that in both grades and 
in all achievement areas this is significant 
(p < .05). The effectiveness of the whole 
High School Personality Questionnaire is 
revealed in Model 5 and one can see that it 
is highly significant in both grades and for
all subjects (p < .001). Models 7 and 8
show that in all grades and subjects each of
the two measures of intelligence signifi-
cantly predicts achievement (p < .001), and
it is interesting to note the Culture Fair In-
telligence Test is superior to Factor B in
the seventh grade for mathematics, whereas
Factor B is preferable in the sixth grade
for both social studies and science. Model 4
demonstrates that adding Factor B to the
Culture Fair Intelligence Test results in a
significant increase (p < .001) in prediction
for all subjects and grades. This result is not
too surprising as the Culture Fair Intelli- 
gence Test was designed to measure a differ- 
ent form of intelligence (fluid or innate) 
than was Factor B (a more crystalline 
measure influenced by environmental fac- 
tors). 

Finally, from the regression analyses, 
several conclusions can be made regarding 
which tests a teacher or counselor should 
use to best predict achievement in social 
studies, science, math, or reading. 


Social Studies 

The results indicate that measures of 
both IQ and personality are superior to 
either type of measure taken singly. How- 
ever, a comparison between Models 1 and 5 
indicates that the High School Personality 
Questionnaire taken alone (as it contains a 
measure of intelligence) is as effective as 
when it is taken in addition to the Culture 
Fair Intelligence Test. This is true for both 
sixth and seventh grades.


Science 

The results are essentially the same as for 
social studies in both grades, that is, the 
High School Personality Questionnaire pro- 
vides a very adequate prediction instrument 
without the addition of the Culture Fair In- 
telligence Test. 


Mathematics 


Here the Culture Fair Intelligence Test 
measure considerably improves in its pre-
dictive value in both grades but personality
variables (Model 6) are very important es- 
pecially in the seventh grade. Model 1 com- 
pared to Model 5 indicates that in this area 
the best prediction of achievement can be 
obtained by using both the High School Per- 
sonality Questionnaire and Culture Fair In- 
telligence Test measures together. 


Reading 


The results are essentially the same as for
mathematics, that is, IQ and personality
measures are both important and the Cul-
ture Fair Intelligence Test + High School
Personality Questionnaire provides for the
best prediction of achievement.


All in all, the results of this experiment 
are compatible with those found by Cattell 
and Butcher (1968). Intelligence quotient 
generally seems to account for approxi- 
mately 20%-30% of the variance in achieve-
ment scores but the addition of personality
measures doubles this amount of the pre- 
diction. 


Cross-Validation Analysis 


As a cross-validation check, the sixth-
grade beta weights were applied to the sev-
enth-grade data, and vice versa. The result-
ant R² values are shown in Table 5. As can
be seen in this table, the original R² values
and those obtained from the beta weights
from another grade are very similar in mag-
nitude for any given achievement area. The
regression equations are given in Table 6.
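
The cross-validation check described above can be sketched as follows: weights estimated in one grade are applied to the other grade's predictor scores, and the squared correlation between the resulting composite and the observed achievement score is compared with the original R². The data are simulated, the helper names are hypothetical, and raw least-squares weights stand in for the standardized beta weights mentioned in the text, so this is an illustration of the idea rather than a reproduction of the authors' computation.

```python
import numpy as np


def fit_weights(X, y):
    """Least-squares weights (intercept first) for predicting y from X."""
    design = np.column_stack([np.ones(len(y)), X])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights


def r2_with_weights(weights, X, y):
    """Squared correlation between y and the composite built from `weights`."""
    design = np.column_stack([np.ones(len(y)), X])
    predicted = design @ weights
    return float(np.corrcoef(predicted, y)[0, 1] ** 2)


def simulate_grade(rng, n, n_predictors=15):
    """Hypothetical predictor scores and one achievement score for n pupils."""
    X = rng.normal(size=(n, n_predictors))
    y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)
    return X, y


rng = np.random.default_rng(1)
X6, y6 = simulate_grade(rng, 169)  # sixth-grade sample size from the text
X7, y7 = simulate_grade(rng, 142)  # seventh-grade sample size from the text

w6 = fit_weights(X6, y6)
w7 = fit_weights(X7, y7)

original_r2_6 = r2_with_weights(w6, X6, y6)  # own weights
cross_r2_6 = r2_with_weights(w7, X6, y6)     # seventh-grade weights
original_r2_7 = r2_with_weights(w7, X7, y7)
cross_r2_7 = r2_with_weights(w6, X7, y7)

print(f"sixth grade:   original {original_r2_6:.2f}, cross {cross_r2_6:.2f}")
print(f"seventh grade: original {original_r2_7:.2f}, cross {cross_r2_7:.2f}")
```

Least-squares weights maximize the squared correlation in the sample they were fitted on, so a cross-validated value close to the original suggests the weights are not capitalizing on sample-specific chance.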


CONCLUSIONS

From the correlational analyses, a per-
sonality description of high and low achiev-
ers in the four areas of social studies, sci-
ence, mathematics, and reading can be ob-
tained.


TABLE 5

CROSS-VALIDATION ANALYSIS

                       Sixth-grade data                   Seventh-grade data
Criteria          R² (original)  R² (seventh-grade   R² (original)  R² (sixth-grade
                                 beta weights)                      beta weights)
Social studies        .42            .44                 .38            .55
Science               .37            .95                 .44            .51
Mathematics           .53            .49                 .55            .68
Reading               .50            .49                 .59            .74

In the sixth and seventh grades it appears 
that the high achiever, in all the above four 
areas, is intelligent (Factor B and the Cul-
ture Fair Intelligence Test) and conscien- 
tious (Factor G). In addition, in the sixth 
grade, warmheartedness (Factor A) is also 
related to all four areas of achievement. In 
mathematics, it also helps if the child is ad- 
venturous (Factor H), and in social studies 
if he is a little anxious (Q4). 

In the seventh grade it appears that Fac- 
tor A no longer plays such a large part in 
achievement but emotional stability (Factor 
C), the desire for group action (Factor J), 
self-assuredness (Factor O), and strong 
willpower (Factor Q3) now are significantly
related to achievement in all areas. In sci- 
ence and mathematics the high achievers are 
toughminded (Factor I) and it helps in 
mathematics if their dominance (Factor E) 
scores are high. 

In brief, it would seem that three main
conclusions can be made in relating per-
sonality to achievement in the sixth and
seventh grades.

1. There does seem to be a general con-
cept of achievement which is consistently
related to a set of personality and intelli-
gence measures over all four achievement
areas. This set of variables includes IQ, and
Factors B and G.

2. There are certain personality factors
which follow a developmental sequence in
their relationship to achievement. For ex-
ample, in the sixth grade, Factor A is im-
portant but in the seventh grade it is not. In
Grade 7, Factors C, J, O, and Q3 become
important although they were not in the
sixth grade.

3. Some personality factors are specifi-
cally related to individual areas. Thus, in
both the sixth and seventh grades, Factor H
is related to achievement in mathematics.


TABLE 6

REGRESSION EQUATIONS

[Table body not recovered: regression equations for social studies, mathematics, reading, and science.]

Note. Decimal points have been omitted.


REFERENCES 


Cattell, R. B., & Butcher, H. J. The prediction
of achievement and creativity. Bobbs-Merrill,
1968.

Edwards, M. P., & Tyler, L. E. Intelligence, cre-
ativity and achievement in a non-selective public
junior high school. Journal of Educational Psy-
chology, 1965, 56, 96-99.

Middleton, G., & Guthrie, G. M. Personality syn-
dromes and academic achievement. Journal of
Educational Psychology, 1959, 50, 66-69.

Shinn, E. O. Interest and intelligence as related to
achievement in tenth grade. California Journal
of Educational Research, 1956, 7, 217-220.

Warburton, F. W. The measurement of personal-
ity. I. Educational Research, 1961, 4, 2-18.

Warburton, F. W. The measurement of personal-
ity. II. Educational Research, 1962, 4, 115-132.
(a)

Warburton, F. W. The measurement of personal-
ity. III. Educational Research, 1962, 4, 193-206.
(b)

Wellman, F. E. Differential prediction of high
school achievement using single score and multi-
ple factor tests of mental maturity. Personnel
and Guidance Journal, 1957, 35, 512-517.

Wolking, W. D. Predicting academic achievement
with the differential aptitude and P.M.A. tests.
Journal of Applied Psychology, 1955, 39, 115-
118.


(Received January 21, 1971) 


JOURNAL OF
EDUCATIONAL PSYCHOLOGY


October 1972


Volume 63


CONTENTS


Effects of Different Sources of Positive and Negative Information on Observa-
tional Learning of a Teaching Skill
JOHN J. KORAN, JR., MARY LOU KORAN, AND FREDERICK J. MCDONALD

Achievement as a Function of Language Competence, Behavior Adjustment,
and Sex in Young, Disadvantaged Mexican-American Children
JAMES M. STEDMAN AND RUSSELL L. ADAMS

Perceived Reward Value of Teacher Reinforcement and Attitude toward Teacher:
An Application of Newcomb’s Balance Theory
DEWITT C. DAVISON

Correlation of Paired-Associate Performance with School Achievement as a
Function of Task Abstractness
DAVID H. FELDMAN, LEE ELLEN JOHNSON, AND TERRILL A. MAST

Imaginal Facilitation of Paired-Associate Learning: A Limited Generalization
JOEL R. LEVIN AND SANDRA A. KAPLAN

Developmental Patterns in Elemental Reading Skills: Phoneme-Grapheme and
Grapheme-Phoneme Correspondences
MADELINE HARDY, P. C. SMYTHE, R. G. STENNETT, AND R. H. WILSON

A Behavioral Index of the Exploratory Value of Prose Materials
LARRY T. BROWN

Recall of Place on the Page
EUGENE B. ZECHMEISTER AND JACK MCKILLIP

Grade Expectations, Differential Teacher Comments, and Student Performance
BERNARD HAMMER

Interactive Relationship between Inquisitiveness and Student Control of In-
struction
HAROLD F. O'NEIL, JR.

The First Letter Mnemonic
DOUGLAS L. NELSON AND CYNTHIA STARK ARCHER

Differentiation of Grapheme-Phoneme Units as a Function of Orthography
PETER R. OLIVER, JACQUELYN M. NELSON, AND JOHN DOWNING

Effects of Comparison Level Feedback on Classroom-Related Verbal Learning
Performances
C. R. SNYDER

Instructional Specificity and Outcome Expectation in Observationally Induced
Question Formulation
TED L. ROSENTHAL AND BARRY J. ZIMMERMAN

Delay-Retention Effect with Multiple-Choice Tests
RAYMOND W. KULHAVY AND RICHARD C. ANDERSON


© 1972 by the American Psychological Association, Inc. 


NOTICE TO AUTHORS 


In response to a request from the APA Publications Board, the Journal 
of Educational Psychology has agreed to participate in a blind reviewing 
system. An effort will be made to insure that no information regarding 
the identity of an author or of the institution with which the author is 
affiliated be afforded to reviewers. 

Authors submitting manuscripts to the Journal of Educational Psychol- 
ogy are requested to include with each copy a cover sheet containing the 
title of the manuscript, the name of the author or authors, the author’s in- 
stitutional affiliation, and the date the manuscript was submitted. The first 
page of the manuscript should also state the title and the date it was sub- 
mitted, but no author names or affiliations. Footnotes pertaining to the 
identity of the author or his institutional affiliation should be on a separate 
page. Every effort should be made by the author to see that the manuscript 
itself contains no clues as to his identity. 

Authors are requested to use the proposed format immediately. Manu-
scripts that do not conform to these suggestions will be returned.


Journal of Educational Psychology
1972, Vol. 63, No. 5, 405-411


EFFECTS OF DIFFERENT SOURCES OF POSITIVE AND
NEGATIVE INFORMATION ON OBSERVATIONAL
LEARNING OF A TEACHING SKILL¹

JOHN J. KORAN, JR., AND MARY LOU KORAN
University of Florida

FREDERICK J. McDONALD
Educational Testing Service, Princeton, New Jersey


In an experiment designed to test the effects of different sources and
types of information on observational learning of a teaching skill, 78 
subjects were randomly assigned to one of six treatment groups em- 
ploying three sources of model content (teacher behavior, student 
behavior, or teacher-student interaction) and two types of informa- 
tion content (positive and negative examples). Dependent variables 
were frequency and variety of categories of scientific-process-eliciting 
questions generated by subjects in two microteaching sessions. A 
3 X 2 X 2 repeated measures analysis of variance showed that 
teacher-student interaction and student behavior models produced 
significantly more process questions than did the teacher behavior 
models. Moreover, positive models produced a significantly greater 
increase in the variety of categories of process questions than did
negative models.


The use of modeling procedures as a means
of influencing human behavior has been
well documented (Bandura, 1965; Bandura
& Walters, 1963). This research suggests
that complex behavior may be acquired or
modified through observation with no direct
external reinforcement. Modeling procedures
have been found to be more effective than
operant conditioning in transmitting new
response patterns (Bandura & McDonald,
1963), with the provision of a model alone
being as effective as the combination of
modeling and reinforcement for initial learn-
ing. Other research has shown that film-
mediated models have been as effective as
live models in producing behavior change
(Bandura, Ross, & Ross, 1963).

¹ This article was based on a doctoral disserta-
tion by the first author while at Stan-
ford University. The research was funded by
Stanford Research and Development as a part of
their Technical Skills of Teaching project.
Requests for reprints should be sent to John
J. Koran, Jr., College of Education, University of
Florida, Gainesville, Florida 32601.


The implication of these findings for 
teacher training is that the provision of
live or symbolic models displaying desired 
teacher behaviors may provide an effective 
alternative to traditional descriptive tech- 
niques of training (McDonald & Allen, 
1967). Applications of modeling procedures 
to teacher training have demonstrated the 
efficacy of film-mediated models for devel- 
oping questioning skills in science (Koran, 
1969, 1970, 1971), the general superiority of 
film-mediated models over written models, 
and the interaction of individual differences 
with various modeling procedures in the 
acquisition of teaching skills (Koran, Snow, 
& McDonald, 1971). However, these appli- 
cations of modeling procedures to teacher 
training have not explored the sources of 
the effect of the models. That is, in order to
acquire a teaching skill, is it necessary to
observe teacher-student interactions, or
would observation of either the teacher be-
havior or the student responses alone be
equally effective in producing the desired
behavior? Each of these alternative sources
of information may conceivably be effective 
in producing desired teacher behavior. The 
teacher-student interaction model conveys 
a wider range of more explicit information 
although both teacher behavior and teacher- 
student models provide behaviors that could 
be observed and later reproduced according 
to the contiguity theory of observational 
learning (Bandura, 1965, 1970; Sheffield, 
1961). Similarly, the observation of student 
behavior alone might serve to clarify the in- 
structional objective sought by the teacher, 
and provide a source of vicarious reinforce- 
ment for covert observer behavior (Ban- 
dura, 1970). 

Moreover, while the use of positive and 
negative information has been studied in 
concept learning and problem solving, it 
has not been studied within the context of 
observational learning. Bandura and Walters 
(1963) point out that models do not nec- 
essarily need to be positive. Negative ex- 
emplary models are often used to demon-
strate some undesirable behavior in which
the consequences of that behavior to the 
model are pointed out, and the observer 
is exhorted not to emulate this example. 
Such models are thought to be less effective 
for learning since they focus on, and some- 
times elaborate, inappropriate behavior that 
may have otherwise received little attention. 
Although the literature on the use of nega- 
tive information strongly suggests that the 
skills necessary to process negative infor- 
mation are generally poorly developed 
(Braley, 1963; Hovland, 1952; Hovland & 
Weiss, 1953; Olson, 1963), there is evidence to 
indicate that specific instruction and practice 
can guide learners to utilize information pre- 
sented in negative instances (Freibergs & 
Tulving, 1961; Fryatt & Tulving, 1963; 
Huttenlocher, 1962). Thus, teacher trainees
may conceivably be trained to make more 
effective use of negative information in the 
acquisition of teaching skills. 

Such exploration is important for eval- 
uating factors influencing learning efficiency 
when modeling procedures are used in 
teacher training programs. Consequently, 
the purpose of this experiment was to ex- 
amine the effects of three sources of model
content (teacher behavior, student behavior,
and teacher-student interaction) and two
types of information content (positive and 
negative) in an attempt to further clarify 
the effects of modeling procedures in teacher 
training. In each case, the amount and 
type of information communicated varied, 
and patterns of information processing were 
required that were expected to be more or 
less demanding and of more or less value 
as training strategies. Due to the greater 
amount and more explicit nature of the in- 
formation conveyed, it was anticipated that 
observation of teacher-student interaction 
would produce significantly greater behavior 
change than observation of either teacher 
behavior or student behavior alone. While 
both positive and negative models were ex- 
pected to be potentially effective in produc-
ing behavior change, previous research and
theory suggested that the positive models 
would be the more effective of the two. 


METHOD


Subjects 

The experimental sample consisted of 78 fifth- 
year teaching interns at Stanford University, 
University of California, Berkeley, and local 
state colleges with graduate teacher training 
programs. Subjects had a mean Graduate Record 
Examination score of 1,100 and undergraduate 
grade point average of 3.0. All subjects were ex- 
posed to the experimental procedures as an inte- 
gral part of their training. 


Procedure 

Subjects were randomly assigned to one of six
treatment groups employing three sources of
model content: teacher-student interaction (TS),
teacher behavior (T), or student behavior (S);
and two types of information content: positive
(+) or negative (−) exemplars. The experimental
procedures consisted of preliminary set induction
materials and a 30-minute microteaching pretest
followed by the modeling treatment and a subse-
quent 30-minute microteaching posttest. Thus, a
3 X 2 X 2 factorial design was employed consisting
of three sources of information, two types of
information, and two teaching sessions.

Film-mediated models for all treatments were
30 minutes in duration and filmed simultaneously
from the same microteaching interaction. One
camera captured the total interaction (TS); one
camera focused on teacher behavior alone (T);
the other camera focused on student behavior
alone (S). The same procedure was used in de-
veloping both positive and negative models.
Opportunities for question asking and answering
were equated in all of the models.

Two criterion measures were used in this study:
frequency and variety of process-eliciting ques-
tions. Process-oriented questions, or positive
questions, were derived from Science-A Process
Approach Curriculum (Gagné, 1965), and were
defined as those questions which required students
to think and respond in terms of the following
categories: observation, classification, inference,
prediction, communication of ideas, hypothe- 
sizing, planning experiments, and interpreting 
data. Negative examples were defined as content- 
oriented rather than process-oriented behavior. A 
frequency score was comprised of the total number 
of process questions generated by the subjects. A 
variety of categories score was comprised of the 
number of different types or classes of questions 
asked. That these two scores measured different 
behaviors can be inferred from the relatively low 
correlation of .36 between these variables. Theo- 
retically, the use of any category of process ques- 
tions by the subjects represents an available class 
of responses in the behavioral repertoire of that 
subject as well as a single occurrence of a behavior
of interest. Accordingly, a total frequency score 
representing the use of all categories would demon- 
strate more desirable training effects than would 
the same total frequency score representing the 
use of fewer categories. 
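
The two criterion scores can be illustrated with a minimal sketch. It assumes each question in a transcript has already been coded with one category label; the label strings and the function name `score_session` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# The eight process categories named in the text (label strings are assumed).
PROCESS_CATEGORIES = {
    "observation", "classification", "inference", "prediction",
    "communication", "hypothesizing", "planning experiments",
    "interpreting data",
}


def score_session(coded_questions):
    """Return (frequency, variety) for one microteaching session.

    coded_questions: list of category labels, one per question asked.
    Labels outside PROCESS_CATEGORIES (e.g. "content") are ignored.
    """
    counts = Counter(q for q in coded_questions if q in PROCESS_CATEGORIES)
    frequency = sum(counts.values())  # total number of process questions
    variety = len(counts)             # number of distinct categories used
    return frequency, variety


# Hypothetical coded transcript: four process questions in three categories.
example = ["inference", "inference", "content", "prediction", "observation"]
print(score_session(example))  # (4, 3)
```

Two sessions with the same frequency can differ in variety, which is the distinction the paragraph above draws between the two scores.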

All microteaching sessions were audiotaped. 
Typed transcripts of the microteaching pretest 
and posttest teaching sessions were independently 
rated by three judges. An analysis of variance
model described by Winer (1962) was used to ob- 
tain the reliability of the mean score for the three 
judges on each category. Interrater agreement 
ranged between .75 and .99 on the categories of 
process questions and .99 on the frequency of
process questions. 
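
One common analysis-of-variance estimate of the reliability of a mean rating is sketched below. It is offered under assumptions: a one-way layout (transcripts by judges) and the formula (MS between − MS within) / MS between. Whether this matches the exact model taken from Winer (1962) is not stated in the text, and the function name is hypothetical.

```python
import numpy as np


def reliability_of_mean_rating(ratings):
    """Reliability of the mean of k judges from a one-way ANOVA.

    ratings: array of shape (n_transcripts, k_judges).
    Returns (MS_between - MS_within) / MS_between.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Mean square between transcripts (df = n - 1).
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    # Mean square within transcripts, i.e. judge disagreement (df = n(k - 1)).
    ms_within = np.sum((x - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / ms_between


# Hypothetical frequency counts from three judges on three transcripts.
demo = [[10, 11, 12], [20, 19, 21], [30, 31, 29]]
print(round(reliability_of_mean_rating(demo), 3))
```

When the judges agree exactly, the within-transcript mean square is zero and the coefficient equals 1.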


Treatment Conditions 


Following the provision of set induction ma- 
terials describing the objectives and procedures 
for the treatment condition, subjects in the TS+ 
group were asked to identify the positive instances 
of teacher and student process-oriented behavior 
when they occurred in the model. Subjects in the
T+ and S+ groups were asked to identify the
positive instances of behaviors they viewed, and
were additionally asked to induce either the student
behavior that would follow the appropriate
teacher question (T+) or the teacher question
that would have produced the desired student
responses (S+).

For the negative modeling treatments, subjects
in the TS− group were asked to identify the
negative (content-oriented) instances of teacher
and student behavior and subsequently to induce
positive (process-oriented) examples of teacher
and student behavior in their place. Subjects in
the T− and S− groups additionally were asked
to induce either the negative student behavior
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elicited by inappropriate teacher questions viewed 
(T−) or the inappropriate teacher question that
would have elicited the negative student behavior
(S−) before inducing positive exemplars.

The same number of questions and answers 
were given in positive and negative models. In 
each case, the subjects were required to verbalize 
modeling stimuli (Bandura, 1966). However, sub- 
jects in the negative modeling treatments were 
required to be more active participants since the 
information given to them in the model required 
identification and then the induction of alterna- 
tives. 


RESULTS AND DISCUSSION


A 3 (sources of content) X 2 (informa- 
tion types) X 2 (microteaching sessions) 
repeated measures analysis of variance was 
used to test the treatment effects. Means 
and standard deviations of the two depend- 
ent variables in the pretest and posttest 
microteaching sessions are reported in 
Table 1. 

For the frequency of process questions, 
significant effects were found due to sources 
of model content, with subjects in both 
the TS and the S groups generating signifi- 
cantly more process-eliciting questions than 
did subjects in the T only group (p < .01). 
A significant treatment effect was also found 
across teaching sessions (p < .01). Although 
preliminary analysis of simple main effects 
showed that the three groups differed sig- 
nificantly on the posttest, while not on the 
pretest, the absence of a Sources of Content 
X Teaching Sessions interaction suggested 
that pretest differences contributed to the 
observed effect. 

A significant effect was also found across 
teaching sessions for the number of categories
of process-eliciting questions (p < .05).
The presence of an Information Types
X Teaching Sessions interaction showed that 
positive models produced an increase in the 
number of categories of process-eliciting 
questions (p < .05), while negative models 
did not. The analysis of variance results 
for the performance criterion measures are 
shown in Table 2. 

The purpose of this study was to examine 
the relative effects of different sources 
(teacher-student, teacher, student) of posi- 
tive and negative information in observa- 
tional learning. Previous research and theory 
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TABLE 1 
MEANS AND STANDARD DEVIATIONS OF DEPENDENT VARIABLES

                                                          Treatment groups
Criterion               Session       TS+           TS−           T+            T−            S+            S−
                                    M     SD      M     SD      M     SD      M     SD      M     SD      M     SD
Frequency of process    Pretest   50.30 15.09   48.55 16.69   39.42 14.22   41.33 14.70   52.02 22.03   46.76 10.99
questions               Posttest  57.43 16.03   55.41 20.37   38.08 17.37   42.28 16.22   55.67 18.97   50.70 19.86
Categories of process   Pretest    5.15  1.90    5.69  1.97    4.75  1.91    4.76   .83    5.07  1.77    4.92  1.03
questions               Posttest   6.00  1.73    5.84  1.77    5.16  1.80    4.61  1.19    5.85  1.70    4.92  1.60

Note. Abbreviations: TS = teacher-student interaction; T = teacher behavior; and S = student
behavior. The + and − connote positive and negative exemplars, respectively.


in observational learning and concept for- 
mation suggested that TS models would 
produce greater effects than T or S models 
and that positive models would produce a 
greater effect than the negative models. The 
former hypothesis can be only partially ac- 
cepted since TS models were most effective 
overall in producing a greater frequency 
of process questions. However, while both 
TS and S models produced greater perform- 
ance frequencies than T models (p < .05), 
they did not differ significantly from each 
other. The fact that the TS and S models 
were effective in producing behavior change 
suggests that these models provided a sufficient
behavioral conception of the skill
to be acquired. Similarly, the particular 


problem-solving procedures used in con- 
junction with the negative models appeared 
sufficiently effective to enable the observers 
to induce the desired process-oriented be- 
haviors and later recall and reproduce these 
in microteaching. Since both the TS and 
S models produced greater effects than the 
T models, one might infer that the student 
component of the model was a vital instruc- 
tional factor, serving perhaps as a source 
of vicarious reinforcement to the subjects 
witnessing TS models and to the covertly 
posed questions of subjects observing S
models. Viewing the student behavior com- 
ponent of the TS and S models may also 
serve to clarify the behavioral objective of 
the training session for the observer so that 


TABLE 2
ANALYSIS OF VARIANCE FOR PERFORMANCE CRITERION MEASURES

                                   Frequency of process-     Variety of process-
                                   eliciting questions       eliciting questions
Source of variation         df       MS          F             MS          F
Between subjects
  Sources (A)                2     2,438.49     4.91**         9.38       2.1
  Information types (B)      1        98.55      .19           1.64       [illegible]
  A X B                      2       228.01      .46           1.78        .40
  Error                              496.94                    4.46
Within subjects
  Teaching sessions (C)      1       558.67     5.89**         4.54       4.80*
  A X C                      2       155.23     1.63            .47        .50
  B X C                      1          .63      .01           4.54       4.86
  A X B X C                  2         5.26     [illegible]
  Error                     72        95.05                     .93

* p < .05.
** p < .01.
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a wide range of ways to elicit the observed, 
or the induced, behavior can be generated 
by the observer. Thus, the amount of in- 
formation conveyed and the effect of stu- 
dent feedback appeared to be important 
mechanisms in the instructionally effective 
models. 

While negative models were effective in 
producing increases in the frequency of 
process questions, they did not produce in- 
creases in the variety of categories of process 
questions, whereas the positive models did. 
This could be attributed to the more limited 
information communicated by these models; 
hence, although an observer could generate 
some categories of process alternatives, the 
information provided was insufficient to 
stimulate the induction of a wide range of
categories of process questions. The fact 
that positive models as a group were more 
effective than negative models in producing 
increased variety of process questioning is
consistent with the research in concept for- 
mation and information processing. In part, 
this reflects the wider range, greater amount, 
and more explicit nature of the information 
conveyed in the positive instances, and the 
smaller burden they place on memory. However,
the effectiveness of negative models
in producing increased frequency of process
questions suggests that under proper
instructional conditions, learners can be
trained to process potentially useful negative
information.

Although it has not been demonstrated
that the training procedures used here
represent the most effective way to train
teachers, these findings suggest that both
positive and negative models, and
teacher-student interaction and student behavior
models, can be successfully used in teacher
training situations analogous to those used
in this experiment. Since most teacher trainees
observe negative models in the schools
at one time or another, it is probable that
specific instruction and practice in using
such negative information would increase
its potential instructional value. Similarly,
the combination of viewing models providing
both positive and negative examples
in an analytical context may well exceed
the effects of one or the other type of model
alone. Moreover, the importance of witnessing
student responses in the application
of observational learning procedures to
teacher training appears to have been confirmed
here. It remains for future research
to determine whether the observation of
student responses acts (a) to increase the
observer's sensitivity to student behavior;
(b) to establish a goal-oriented training session
in which observers can develop a wider
range of interactive skills; or (c) perhaps
as vicarious reinforcement to the observer
for responses previously made.
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This study investigated language competence, behavioral adjust- 
ment, and sex as predictive of first-grade achievement in disadvan-
taged Mexican-American children of preschool age. Language and
behavior measures were obtained from 122 Head Start enrollees and 
76 available subjects were retested for achievement at the end of the 
first grade. Results indicated that the teacher behavior rating of In- 
troversion-Extroversion constituted the strongest predictor of lan- 
guage achievement, whereas English language competence proved to 
be the strongest predictor of math. Spanish language competence 
failed to predict any language variable. Sex failed to predict achieve- 
ment strongly, and did not correlate highly with the most important 


behavioral predictor. 


Research has clearly demonstrated that 
children of all socioeconomically depressed 
ethnic groups manifest deficiencies in 
achievement, and the Mexican-American 
child is no exception. Data collected by the 
United States Office of Education (1966) in- 
dicate that the Mexican-American child's 
achievement test scores are well below na- 
tional norms for all grade levels. Review of 
the literature suggests that systematic 
studies of factors contributing to achievement
deficits in Mexican-Americans are not
nearly so numerous as those relating to other
ethnic and racial groups. 

Anderson and Evans (1969) completed 
a study relating achievement measures to a
number of variables in a large stratified 
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sample of Mexican-American youth of 
junior high school age. Their findings indi- 
cated that variables such as sex, father’s ed- 
ucation, and socioeconomic level are predic- 
tive of achievement in Mexican-American 
youth, just as they are for other groups. 
Among language variables, they found "reported"
use of English in the home to be
positively predictive of achievement test
scores in math but, surprisingly, not to be
predictive of language scores. Furthermore, 
measures of Self-Concept of Ability and In- 
dependence Training, both nonlanguage 
variables, proved to be the strongest predic- 
tors of all language and math achievement 
measures. 

The present study investigated language 
(direct measures of basic language compe- 
tence in both English and Spanish) and non- 
linguistic (teacher adjustment ratings and 
sex) variables as predictive of first-grade 
achievement in a sample of Mexican-Ameri- 
can Head Start children. Although Anderson 
and Evans failed to demonstrate a predictive 
relationship between language usage and 
language achievement, other research (Mil- 
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ner, 1951) has established such a relation- 
ship with other groups. Thus, the present 
authors felt that the language findings of 
Anderson and Evans might reflect their use 
of indirect and perhaps weak measures of 
language skill (e.g., self-report of language 
usage in the family) and that use of direct 
measures of basic language skill might pro- 
duce much different results. Hence, it was 
hypothesized (a) that direct measures of 
basic English competence would be strongly
predictive of both language and math 
achievement scores; (b) that these direct 
language measures would predict achieve- 
ment equally as well or better than behav- 
ioral measures; and (c) that, in line with the 
results of Anderson and Evans, Spanish 
language skill would not be predictive of 
either language or math achievement scores. 

Results demonstrating the strength of 
self-concept in predicting achievement (Jer- 
sild, 1952; Reeder, 1955), and the findings of 
Anderson and Evans specific to Mexican-Americans,
suggested that behavioral adjustment
measures would also constitute strong
predictors of achievement in Mexican-Amer- 
ican children of Head Start age. Thus it was 
hypothesized that teacher behavior ratings, 
as reflective of personality constructs in the 
children, would predict achievement in both 
language and mathematics, though perhaps 
not as powerfully as English language skill. 
Since the rating scale provided for evalua- 
tions in three factorially separate behavioral 
areas (Positive or Negative Task Orienta- 
tion, Positive or Negative Social Behavior, 
and Introversion-Extroversion), it was pos- 
sible to make special predictions for each fac- 
tor. On the basis of face validity, it was hy- 
pothesized that Task Orientation would be 
most related to achievement and thus con- 
stitute the best predictor of achievement 
measures. The Social Behavior and
Introversion-Extroversion factors were expected to
follow in that order. Finally, it was anticipated
that sex would exercise its usual predictive
influence, especially with regard to
language variables.


METHOD 


Subjects 


The original sample consisted of Mexican-American
children living in a predominantly lower-
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class Mexican-American section of west San An- 
tonio, Texas. All were 5-year-old Head Start en- 
rollees during the summer of 1968, and were lo- 
cated in 10 separate classrooms within five 
separate schools of the San Antonio Independent
School District. Complete language and behavioral
data were collected for 122 of these subjects.
Of these 122, complete follow-up data were avail- 
able for 76 subjects, most of whom were attending 
public schools located in the same west San
Antonio geographical area. Sample attrition 
between the end of Head Start and the end of first 
grade was caused primarily by family migration 
from the area. 

The families of all children studied were at or 
below the "poverty line" index, established by the
Office of Economic Opportunity to determine
eligibility for the Head Start program. This index
is based on two variables (family size and annual 
income) and ranges from $2,000 for a family of 2 to 
$7,800 for a family of 13. 


Measures 


Classroom Behavior Inventory: Preschool to Primary
(CBI). This 60-item instrument, developed
by Schaefer (1971), requires that teachers rate each
child on a 4-point scale, ranging from very much
like to very little like. Factor-analytic studies
with this instrument, using a principal components
analysis, have yielded three factors. Scales
describing Verbal Expressiveness and Gregariousness
load positively (.92 and .91) and scales describing
Social Withdrawal and Self-Consciousness load
negatively (−.83 and −.80) on Factor 1,
Extroversion versus Introversion. Scales related to
Kindness and Consideration load positively (.20 and .88)
and those describing Irritability and Resentfulness
load negatively (−.94 and −.90) on Factor 2,
Positive Social Behavior versus Social Hostility.
Finally, scales describing Perseverance and
Concentration load positively (.89 and .93), while
those describing Hyperactivity and Distractability
load negatively (−.48 and −.53), on Factor 3,
Positive Task Orientation versus Negative Task
Orientation. A subject's score for each of the three
factors was determined by subtracting summated
poor adjustment factors from summated adequate
adjustment factors (e.g., Verbal Expressiveness +
Gregariousness − Social Withdrawal − Self-Consciousness
= Introversion or Extroversion score).
Additionally, an overall adjustment score was
calculated. This was accomplished by summing
the modified factor scores for each child, taking
account of the algebraic sign. Thus, for each subject,
behavior measures on each factor and an
overall adjustment score were available.
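
The scoring rule described above can be written out as a short sketch. The scale totals below are invented purely for illustration, and the function names are hypothetical; only the arithmetic (adequate scales minus poor scales, then a signed sum across factors) comes from the text.

```python
def factor_score(adequate_scales, poor_scales):
    """Summated adequate-adjustment scales minus summated poor-adjustment scales."""
    return sum(adequate_scales) - sum(poor_scales)


def overall_adjustment(factor_scores):
    """Sum of the three factor scores, keeping each score's algebraic sign."""
    return sum(factor_scores)


# Hypothetical scale totals for one child (not data from the study).
extroversion = factor_score(
    adequate_scales=[18, 16],   # Verbal Expressiveness, Gregariousness
    poor_scales=[7, 9],         # Social Withdrawal, Self-Consciousness
)
social = factor_score([15, 17], [10, 8])     # Kindness, Consideration vs. Irritability, Resentfulness
task = factor_score([12, 11], [14, 13])      # Perseverance, Concentration vs. Hyperactivity, Distractability

print(extroversion, social, task)                      # 18 14 -4
print(overall_adjustment([extroversion, social, task]))  # 28
```

A negative factor score, as for task orientation in this invented case, lowers the overall adjustment score because the sign is retained.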
Test of Basic Language Competence (TBLC) in
English and Spanish (Level 1): Children ages [illegible].
Language ability in English and Spanish was
assessed by this test, developed by Cervenka (1967).
This instrument, consisting of nine English and
nine parallel Spanish subtests, is designed to
assess the basic language competence of children
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who speak English or Spanish, or who are bilingual 
in these two languages. It is anticipated that chil- 
dren of ages from 5 to 6 who live in Texas and who 
speak English or Spanish well—or both languages 
as native languages—will make a near perfect 
score on the subtests of the battery. Normative 
and content validity studies, reported by 
Cervenka (1967), substantiate this assumption, 
at least for those subtests included in the present 
study. The construct of “Basic Language Com- 
petence’’ is defined by the child's score on each 
of the nine subtests which sample various dimen- 
sions of basic language. Subtests include the 
following: (1) Oral Vocabulary; (2) Compre- 
hension of Commands and Directions; (3) Rec- 
ognition of Interrogative Patterns; (4) Phonemic 
Discrimination at Word Level; (5) Production of 
Grammatical Structures; (6) Assimilation of 
Meaning; (7) Phonemic Discrimination at Sen- 
tence Level; (8) Grammatical Sensitivity; and (9) 
Grammatical Discrimination. Due to time limi- 
tations, only four subtests were administered. 
They measured the following competencies: 


1. Oral Vocabulary: knowledge of a limited
number of lexical items;

2. Comprehension of Commands and Directions:
ability to comprehend specific commands
and directions in a language;

3. Recognition of Interrogative Patterns:
ability to recognize basic interrogative
patterns in a language;

4. Phonemic Discrimination at Word Level:
ability to hear and produce sound contrasts
in a language (between pairs of words).


For the present study, scores on these four 
subtests were combined to give total English or 
Spanish competence scores. Split-half reliabilities 
for the Head Start sample were calculated for total 
English and Spanish scores by the Spearman-Brown
formula. For 132 subjects completing the
English ability test, the scale reliability was .93; 
and 122 subjects completing the Spanish version 
attained a reliability score of .87. 
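
As a reminder of how a Spearman-Brown corrected split-half coefficient is obtained, a minimal sketch follows. The text does not say how the test was divided into halves; the odd-even item split used here is an assumption, and the function names are hypothetical.

```python
import numpy as np


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length: 2r / (1 + r)."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """Split-half reliability with an odd-even item split (assumed).

    item_scores: array of shape (n_subjects, n_items).
    """
    x = np.asarray(item_scores, dtype=float)
    odd_total = x[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
    even_total = x[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    r_half = np.corrcoef(odd_total, even_total)[0, 1]
    return spearman_brown(r_half)


# A half-test correlation of .50 corresponds to a full-test reliability of about .67.
print(round(spearman_brown(0.50), 2))
```

The correction is needed because the raw correlation between halves describes a test only half as long as the one actually administered.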

Metropolitan Achievement Test: Form A (MAT). 
Achievement measures for Word Knowledge,
Reading, and Mathematics concepts and skills
were obtained from this standardized and widely
used instrument. All raw scores were transformed
to their standard score equivalents, and these
measures constituted the dependent variable for
the present study. 


Procedure 


The Classroom Behavior Inventory was administered
by classroom teachers 2 weeks after the
beginning of the Head Start program. It was hoped
that this delay would allow temporary adjustment
difficulties to clear, so that ratings might reflect
other than transitory adjustment problems. The
Tests of Basic Language Competence were administered
individually, and testing took place in the
school, occurring during the second and third week
of the Head Start program.
Achievement measures were collected at the 
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end of the first-grade year (April and May, 1969). 
Tests were administered on a group basis by the 
senior author and several monitors who aided in 
supervision of the testing. 


RESULTS 


Results were analyzed by an iterative 
multiple correlation as described by Botten- 
berg and Ward (1963), employing computer 
programs similar to those reported in Veldman
(1967). A stop criterion of .00001 was
used for increase in R². The first iteration of
the procedure selects the variable with the
highest validity. The second iteration in
turn selects the next predictor variable
which, together with the first predictor, will
maximally increase R². In each of the remaining
iterations, either another variable
will be added which maximally increases R²
or a weight already in the predictor set will
be altered. In each instance, the decision depends
on the maximal increase in R². When
no additional predictor variables are available
which will increase R² by at least .00001 or when
no further alteration of weights is necessary,
the iteration procedure stops. This procedure
converges toward a least-squares solution.
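
The selection logic can be approximated by a greedy forward selection on R², sketched below. This is a simplification and not the Bottenberg and Ward procedure itself: instead of altering one weight at a time, it refits all weights by ordinary least squares whenever a variable is added. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R-squared of a least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def forward_select(X, y, names, stop=1e-5):
    """Add, one at a time, the predictor giving the largest increase in R-squared.

    Stops when the best available increase does not exceed `stop`.
    Returns a list of (predictor name, cumulative R-squared) in order of entry.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected, history, best = [], [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # Cumulative R-squared obtained by adding each remaining candidate.
        r2, j = max((r_squared(X[:, selected + [c]], y), c) for c in remaining)
        if r2 - best <= stop:
            break
        selected.append(j)
        remaining.remove(j)
        best = r2
        history.append((names[j], r2))
    return history


# Synthetic illustration: x0 matters most, x1 a little, x2 not at all.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 3.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(size=200)
for name, r2 in forward_select(X_demo, y_demo, ["x0", "x1", "x2"]):
    print(name, round(r2, 3))
```

The cumulative R² printed at each step plays the same role as the R² column in the iteration tables that follow.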

Table 1 shows the results of predicting 
Word Knowledge from all the predictor vari- 
ables, with an overall multiple correlation of 
.56. The data indicate that teacher rating of 
Introversion-Extroversion, a behavioral 
measure, constituted the best predictor of 
Word Knowledge and correlated higher with 
Word Knowledge (r = .49) than any other 
variable used, including the measure of basic 
language skill. As shown in Table 1, the next 
variable most predictive of Word Knowledge 


TABLE 1
ITERATION SEQUENCE OF THE MULTIPLE
CORRELATION PREDICTING WORD
KNOWLEDGE ACHIEVEMENT

                                        Zero-order
                                        correlation
                                        with word
Predictor variable             R²       knowledge
Extroversion                   .24        .49
Positive task orientation      .27        .39
Sex                            .29        .05
English language skill         .30        .39
Spanish language skill         .31       −.08

Note. Thirty-six additional iterations resulted
in a final R of .56.
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TABLE 2 
ITERATION SEQUENCE OF THE MULTIPLE
CORRELATION PREDICTING
READING ACHIEVEMENT

                                        Zero-order
                                        correlation
Predictor variable             R²       with reading
Extroversion                   .14        .37
Sex                            .16        .11
English language skill         .18        .30
Extroversion + positive
  social behavior              .18        .26
Spanish language skill         .19       −.01

Note. Twenty-one additional iterations resulted
in a final R of .43.


in combination with Introversion-Extro- 
version was Positive Task Orientation, fol- 
lowed, in order, by sex. Using the regression 
procedure described above, sex was included 
as a predictor variable by assigning a value 
of 1 to males and a 0 to females. English 
language skill, the fourth variable, entered 
the iteration sequence only after the contri- 
bution of the two behavior ratings and sex 
were accounted for. 

Table 2 shows the results of predicting 
Reading achievement from the predictor 
variable set used in this study. The multiple 
correlation was .43. Once again, the be- 
havioral rating of Introversion-Extroversion 
was the best predictor, followed by sex. 
English skill was the third variable to enter 
the iteration sequence. However, it added 
little predictive power, in that it increased 
R² only slightly (2.03% of the variance of
Reading achievement). 

Math scores were also predicted from the 
predictor set of the three behavioral meas- 
ures, English and Spanish language skill 
and sex. The multiple correlation, as shown 
in Table 3, was .78. Although English com- 
petence showed little additional power in 
predicting Word Knowledge and Reading 
achievement, it was highly predictive of 
Math achievement (r = .70). Extroversion-Introversion
was second to enter the iteration
sequence followed in order by Spanish
language skill. The paradoxical situation rep- 
resented by the data indicates that language 
skill is relatively inefficient in predicting 
language achievement but highly predictive 
of Math achievement. 
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TABLE 3 


ITERATION OF THE MULTIPLE CORRELATION
PREDICTING MATH ACHIEVEMENT

                                        Zero-order
                                        correlation
                                        with math
Predictor variable             R²       achievement
English language skill         .48        .70
Extroversion                   .54        .56
Spanish language skill         .59        .28
Positive social behavior       .61        .45

Note. Nine additional iterations resulted in a
final R of .78.


A series of F tests were computed to de- 
termine if English language skill and Spanish 
language skill, taken independently or in
combination, added significantly to the efficiency
of the full model in predicting Math.
The full model includes all language meas- 
ures, all behavioral measures, and sex. 

In other words, the following questions 
were asked: 


1. For Head Start learners of the same sex, same
   behavioral ratings, and same Spanish language
   competence but different English language
   competence, is there a difference in expected Math
   achievement? (Hypothesis A)

2. For Head Start learners of the same sex, same
   behavioral ratings, and same English language
   competence but different Spanish language
   competence, is there a difference in expected
   Math achievement? (Hypothesis B)

3. For Head Start learners of the same sex, same
   English language competence, and same Spanish
   language competence but different behavioral
   ratings, is there a difference in expected
   Math achievement? (Hypothesis C)

4. For Head Start learners of the same sex and
   same behavioral ratings but different English
   and Spanish language competence, is there a
   difference in expected Math achievement?
   (Hypothesis D)

As shown in Table 4, both English independently
considered (p < .0001; Hypothesis A)
and Spanish independently considered
(p < .002; Hypothesis B), as well as
English and Spanish taken together (p <
.0001; Hypothesis D), contributed significantly
to the full model. The behavioral
measures (Hypothesis C) significantly predicted
Math achievement in combination
with the other predictors in the full model.
Remember that all predictor variables were
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TABLE 4 
F TESTS OF MULTIPLE-REGRESSION ANALYSIS FOR MATH ACHIEVEMENT

Model  Dependent variable  Predictor variables                                                R
1      Math achievement    Extroversion, positive social behavior, positive task orientation,
                           English language competence, Spanish language competence,
                           and sex                                                            .78
2      Math achievement    Extroversion, positive social behavior, positive task orientation,
                           Spanish language competence, and sex                               .71
3      Math achievement    Extroversion, positive social behavior, positive task orientation,
                           English language competence, and sex                               .75
4      Math achievement    English language competence, Spanish language competence, and
                           sex                                                                .73
5      Math achievement    Extroversion, positive social behavior, positive task orientation,
                           and sex                                                            .65

Hypothesis   Full model   Restricted model   df       F        p
A            1            2                  1, 70    21.10    .0001
B            1            3                  1, 70    10.30    .002
C            1            4                  3, 70     4.78    .004
D            1            5                  2, 70    17.58    .0001


obtained at the beginning of the Head Start 
program and all criterion variables were ob- 
tained at the end of the first grade. 
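
The comparisons in Table 4 have the form of the standard F test for a full versus a restricted regression model, F = [(R²full − R²restricted) / df1] / [(1 − R²full) / df2]. The sketch below applies that formula to the two-decimal R values printed in the table, so its results only approximate the reported F ratios, which were presumably computed from unrounded values. The function name `nested_f` is hypothetical.

```python
def nested_f(r_full, r_restricted, df1, df2):
    """F ratio for the drop in R-squared when predictors are removed.

    df1: number of predictors removed; df2: residual df of the full model.
    """
    r2_full = r_full ** 2
    r2_restricted = r_restricted ** 2
    return ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)


FULL_R = 0.78   # Model 1 in Table 4
DF2 = 70        # denominator df as printed in Table 4

# Hypothesis: (R of restricted model, number of predictors removed)
restricted = {
    "A": (0.71, 1),   # English removed (Model 2)
    "B": (0.75, 1),   # Spanish removed (Model 3)
    "C": (0.73, 3),   # three behavior ratings removed (Model 4)
    "D": (0.65, 2),   # English and Spanish removed (Model 5)
}

for label, (r, df1) in restricted.items():
    print(label, round(nested_f(FULL_R, r, df1, DF2), 2))
```

The ordering of the resulting ratios mirrors the table: removing English costs the most predictive efficiency, and removing the behavior ratings the least per degree of freedom.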


DISCUSSION 


Much to our surprise, the results of the
present study indicated that measures of
student adjustment constituted the strongest
predictor of language achievement. English
language skill, though measured directly and
positively correlated with language measures
(.39 with Word Knowledge and .27 with
Reading), not only failed to constitute a
stronger predictor of language than behavioral
measures but also failed to add significantly
to the prediction of either Word
Knowledge or Reading in combination with
behavioral measures. These results are contrary
to expectations but are in line with the
findings of Anderson and Evans (1969).

The positive relationship between English 
language skill and Math achievement was 
expected, but the strength of this relation- 
ship (r = .70) was not anticipated. However, 
review of the math concepts and skills
sampled by the Metropolitan Achievement
Test suggests that their attainment might 
well be enhanced by higher proficiency in 
basic English language skills. That is, the 
Mexican-American child who is more pro- 


ficient in basic vocabulary, in comprehending 
questions, and particularly, in following in- 
creasingly complicated directions in English, 
might be better able to follow the teacher’s 
oral language instruction, leading to the at- 
tainment of preliminary math concepts and 
skills. 

In fact, there is some evidence to suggest 
that, with regard to math achievement, basic 
English competence is more important to 
Mexican-American than to Anglo children. 
Cervenka (1969), studying an essentially 
English speaking 6-year-old group, found 
only moderate relationships between Eng- 
lish competence and Math achievement on 
the SRA series (.49 with concepts, .41 with 
reasoning). Thus, English competence was 
of much less importance to Cervenka’s 
young English speakers than to the young 
Mexican-Americans of the present study.
However, it should be pointed out that 
Cervenka’s sample was a predominantly 
middle-class group, so that direct comparison 
is confounded by socioeconomic class dif- 
ferences. 

In the same study, Cervenka found math 
and English competence to be less highly 
correlated at age 7 (.35 range) and to be in- 
significantly related at age 8. These results 
suggest that, with English speaking children, 
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'TBLC language factors are of some impor- 
tance in acquiring early math skills but 
eventually become less crucial. Unfortu- 
nately, there is no directly parallel evidence 
with Mexican-American children. Anderson 
and Evans did find, however, that their indirect
measures of the English language with
Mexican-American children remained sig- 
nificantly correlated with Math achievement 
longer than for Anglo children. Further in- 
vestigation with parallel socioeconomic 
groups would be necessary to corroborate 
this speculation, however. 

Spanish language measures failed to predict
language achievement scores as anticipated,
and even resulted in negative correlations
with both language measures (−.03
with Word Knowledge and −.01 with
Reading). Hence, although English language
skill was inferior to behavioral measures in 
predicting language achievement, it was far 
superior to Spanish language skills. Though 
results suggested that Spanish language skill 
did contribute to Math in a statistically 
significant manner (see Table 4), inspection
of the data indicates that its contribution 
was small compared to that of English 
language skill. 

As expected, teacher behavior ratings were 
predictive of all achievement measures, but 
the strong contribution of the Introversion-Extroversion
factor was not anticipated.
Examination of the subtests (Gregariousness
and Verbal Expression vs. Self-Consciousness
and Social Withdrawal) suggests that
the child higher in extroversion is one who is 
verbally assertive, especially with regard to 
speaking up in the classroom situation, and 
who actively seeks out contacts with both 
peers and adults. In contrast, the more in- 
troverted child, as measured by the CBI, 
prefers not “to perform” for the teacher and 
is less open to social interaction with teachers 
or peers. Perhaps the behavior patterns of 
the more extroverted Mexican-American 
child make him more receptive to the 
teacher-student interaction, and render
him less susceptible to negative evaluation 
by the teacher. 

This speculation finds indirect support in 
a recent study by Bellar (1969), which inves- 
tigated relationships between dependence,
autonomous functioning, and IQ gains in
Negro Head Start children. Results indicated 
that “IQ gainers” made realistic demands 
for help from teachers (instrumentally de- 
pendent behavior), received more positively 
reinforcing reactions from teachers in re- 
sponse to their requests, and coped more 
effectively with failures to receive solicited 
help by attempting the work independently 
or seeking out another adult. Additionally, 
these children were found to be more auton- 
omous in learning cognitive tasks under 
conditions of “intrinsic reinforcement” (ex-
perience of successful outcome only), reflect-
ing a tendency to learn for the sake of learn-
ing. Perhaps CBI measures of Extroversion 
tap constructs similar to Instrumental De- 
pendency and Autonomy in learning and, 
thus, mediate prediction of achievement suc-
cess. Further research might address itself
to student-teacher patterns in Extroverted 
versus Introverted groups and relationships 
to achievement success. 

Sex did not prove to be a strong predictor 
of any achievement criteria and failed to 
correlate significantly with any achievement 
variable. Additionally, sex failed to correlate 
significantly with either language variable 
(—.19 with English skill and .02 with 
Spanish skill), indicating that both boys 
and girls manifested approximately equal 
language competencies. Finally, although 
sex was a significant codeterminant for 
teacher behavior ratings of Social Behavior 
(—.28) and of Positive Task Orientation 
(—.23), it failed to correlate significantly 
with  Introversion-Extroversion (—: 
Thus, sex was not an important determinant
of the most potent behavioral predictor. In
general then, this disadvantaged Mexican- 
American sample manifested a high degree 
of cross-sex homogeneity, a finding not un-
usual in recent research with young disad-
vantaged populations (Boger & Ambron,
1969; Stephenson, 1969).


PERCEIVED REWARD VALUE OF TEACHER REINFORCEMENT
AND ATTITUDE TOWARD TEACHER:
AN APPLICATION OF NEWCOMB'S BALANCE THEORY¹


DEWITT C. DAVISON²
University of Toledo


A 20-item questionnaire representing typical positive and negative re- 
inforcing behaviors of teachers was administered to 256 eighth-grade 
students. Subjects responded to the questionnaire twice, once in ref- 
erence to a “Best Liked” teacher providing the reinforcement and
once in reference to a “Least Liked” teacher providing the reinforce- 
ment. Subjects indicated their feelings about the reinforcement by 
choosing from among five statements ranging from highly favor-
able to highly unfavorable. The significance attached to the positive
reinforcement was related to subjects’ attitude toward the teacher, 
sex, and social class. The significance attached to the negative rein- 
forcement was also related to subjects’ attitude toward the teacher, 
but not to sex and social class. The relationship of attitude toward 
the teacher and receptiveness to teacher reinforcement was concept- 
ualized in terms of Newcomb’s balance theory. 




In studying the effects of reinforcement 
on behavior, the focus of attention has been 
mainly on subjects’ overt responses to rein- 
forcement as the behavioral characteristics 
of interest. Few investigators have been con- 
cerned with individuals’ affective reactions 
to the stimulus events which are intended 
to reinforce their behavior. Yet, how they 
overtly respond to these events depends in 
part on how they perceive them and the 
significance they attach to them.

In the present study, the significance 
students attached to certain teacher be- 
haviors which were intended to reinforce 
them was examined in relation to their at- 
titude toward the teacher, their sex, and 
social-class background. A number of in- 
vestigators have found subjects’ overt re- 
sponsiveness to reinforcement to be posi- 


1 This article is based on portions of the author’s 
doctoral dissertation submitted to the Graduate 
College of the University of Illinois, Urbana, 
Illinois. The author is indebted to Norman E. 
Gronlund for the valuable assistance he provided 
with this study.

2 Requests for reprints should be sent to Dewitt 
C. Davison, College of Education, University of 
Toledo, Toledo, Ohio 43606.

tively related to their attitude toward the
dispenser of the reinforcement (Ferguson &
Buss, 1960; McCoy & Zigler, 1965; Sapolsky,
1960; Simkins, 1961). The findings bearing
on the relationship of social class to rein-
forcer effectiveness are generally mixed
(Douvan, 1956; McGrade, 1965; Rosenhan
& Greenwald, 1965; Terrell, Durkin, &
Wiesly, 1959; Zigler & DeLabray, 1962;
Zigler & Kanzer, 1962). And although sex
differences in responsiveness to reinforce-
ment have been observed in a number of
studies (Ferguson & Buss, 1960; Meyer,
Swanson, & Kauchack, 1964; Rosenhan &
Greenwald, 1965; Rowley & Stone, 1964;
Stevenson, 1961; Stevenson, Keen, &
Knights, 1963), these results are similarly
inconclusive, as the sex group found to be
more responsive varied with different stud-
ies.

The studies cited were concerned with
subjects’ attitudes, sex, and social class in
relation to positive reinforcement only. In
this investigation these variables were ex-
amined in relation to both positive and nega-
tive reinforcement.

As noted, there is evidence of a positive
relationship between the individual's overt
responsiveness to positive reinforcement and 
his attitude toward the dispenser of the re- 
inforcement. However, the dynamics of this 
relationship have been given scarce attention 
in the literature. One of the ways it may be 
conceptualized is in terms of Newcomb's 
(1953) balance theory. His basic paradigm 
involves the co-orientation of two individuals 
(A and B) with respect to each other, and a 
third concept (X) which may be any person, 
object, event, or idea. The attitude of Person 
A toward X is conjointly related to A's 
attitude toward B and his perception of B's 
attitude toward X. An individual tends to 
agree with those toward whom he holds a 
positive attitude and to disagree with those 
toward whom he holds a negative attitude. 
With reference to the problem described, if 
the student has a positive attitude toward 
the teacher, he tends to agree with the 
spirit that he perceives is being held by the
teacher in offering the reinforcement. On the 
contrary, if he has a negative attitude toward 
the teacher, he is inclined to reject, to some 
extent, the intent of the reinforcement, since 
his unqualified acceptance of it would imply 
agreement with someone he dislikes. It 
should be noted that the terms “agree” and 
“disagree,” as applied in the context of this 
study, do not denote opposite states, but 
are used in a sense relative to each other, 
and represent differences in degrees rather 
than differences in kind. Thus, given two
attitudes by students toward teachers, one 
positive and one negative, it was hypothe- 
sized that they would perceive the positive 
reinforcement provided by liked teachers 
as being more rewarding than that provided 
by disliked teachers. Correspondingly, they 
should perceive the negative reinforcement 
provided by liked teachers as being more 
aversive—and hence its removal as more 
rewarding—than that provided by disliked 
teachers.


METHOD

Subjects

The subjects for the study were 118 male and
138 female eighth-grade students from three
communities in Illinois. The communities were
predominantly white and all had populations of
varying economic backgrounds.

Procedure


A 20-item questionnaire consisting of typical 
classroom reinforcing behaviors of teachers was 
developed prior to the study. Items for the ques- 
tionnaire were provided by 77 eighth-grade stu- 
dents who were not a part of the sample for the 
main study. The students were given a mimeo- 
graphed paragraph describing common classroom 
episodes which illustrated student-behavior- 
teacher-reinforcement sequences. The examples 
included instances of both positive and negative 
reinforcement and the associated student be- 
haviors. The students were then asked to provide 
as many similar episodes as they could think of 
that they had witnessed. The teacher-reinforcing 
behaviors chosen for items in the questionnaire 
were those listed most often by respondents. A 
complete description of the questionnaire and its 
development is reported elsewhere (Davison, 
1967). Twelve of the questionnaire items repre- 
sented positive reinforcers and 8 represented 
negative reinforcers. Subjects responded to the 
questionnaire twice, once in reference to a “Liked” 
teacher providing the reinforcement and once in 
reference to a “Disliked” teacher providing the 
reinforcement. The two teacher-referent con- 
ditions under which the questionnaire was ad- 
ministered were separated by 1 week, the order 
being reversed for half of the subjects. On each 
occasion, before responding to the questionnaire, 
each subject was asked to select the teacher he 
most (or least) preferred to have teach him, with- 
out identifying the teacher by name, and indicate 
his attitude toward the teacher on a 5-point, 
descriptive scale. The options on the scale ranged
from “I like him (or her) very much” to “I dis-
like him (or her) very much.” The results from this
scale were used only as a basis for confirming the
subject's attitude toward the teacher chosen. To
respond to the questionnaire items, subjects chose 
from among five statements the one most indic- 
ative of their feelings about the reinforcing 
behaviors in relation to one of the teacher ref- 
erents. Following is an item taken from the ques- 
tionnaire: 

Suppose you were in this teacher's class and 
he (or she) was busy doing something in the 
hall, and your classmates became loud and 
you tried to quiet them. If this teacher saw 
you after class and praised you for what you 
did, how would you feel?

A. I would feel very good if this teacher did
this.
B. I would feel good if this teacher did this.
C. I would feel neither good nor bad if this
teacher did this.
D. I would feel bad if this teacher did this.
E. I would feel very bad if this teacher did
this.

Upon completion of the questionnaire, the 
subjects were asked to provide certain personal 
data, including their sex, age, and parents’ oc-
cupations and educational backgrounds. The
latter information was needed to obtain an index
of each subject’s social-class position. 


RESULTS 


Each subject had four scores, one for 
positive reinforcement dispensed by the 
“Liked” teacher, one for positive reinforce- 
ment dispensed by the “Disliked” teacher, 
and two corresponding negative reinforce- 
ment scores. The responses to the question- 
naire items were assigned to three categories, 
based on whether a subject responded posi- 
tively, neutrally, or negatively to the rein- 
forcement. The positive category included 
Options A and B (see sample questionnaire 
item), the neutral category included Option 
C, and the negative category included Op- 
tions D and E. In scoring the positive rein- 
forcement items, responses in the positive 
category were assigned three points, those 
in the neutral category, two points, and those 
in the negative category, one point. The 
point values were assigned in the reverse 
order for the negative reinforcement items. 

There were 12 items in the scale represent- 
ing positive reinforcers. Nine of the 12 items 
consisted of social reinforcers and the re-
maining 3 consisted of material reinforcers.
The results on these three items were not 
included in this portion of the analysis. 
Therefore, for the items measuring positive 
reinforcement, scale values may have ranged 
from 9 to 27 points. Scale values for the eight 
negative reinforcement items may have 
ranged from 8 to 24 points. 
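
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated point values; the names `POSITIVE_ITEM_POINTS`, `NEGATIVE_ITEM_POINTS`, and `scale_score` are hypothetical and do not come from the study.

```python
# Illustrative sketch of the scoring rule; all names here are hypothetical.
# Options A and B form the positive category, C the neutral, D and E the negative.
POSITIVE_ITEM_POINTS = {"A": 3, "B": 3, "C": 2, "D": 1, "E": 1}
# Negative reinforcement items receive the same point values in reverse order.
NEGATIVE_ITEM_POINTS = {"A": 1, "B": 1, "C": 2, "D": 3, "E": 3}


def scale_score(responses, item_type):
    """Sum the points for one subject's chosen options (A-E) on one scale."""
    points = POSITIVE_ITEM_POINTS if item_type == "positive" else NEGATIVE_ITEM_POINTS
    return sum(points[option] for option in responses)


# Lowest and highest possible totals for the 9 social positive items
# and for the 8 negative items.
print(scale_score(["E"] * 9, "positive"), scale_score(["A"] * 9, "positive"))
print(scale_score(["A"] * 8, "negative"), scale_score(["E"] * 8, "negative"))
```

Under these assumptions the extreme totals reproduce the ranges given in the text: 9 to 27 for the nine social positive items and 8 to 24 for the eight negative items.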

The subjects were classified into three 
social-class groupings, based on Hollings-
head’s (1965) Two Factor Index of Social
Position. Table 1 contains a summary of stu- 
dents by sex and social-class membership. 


TABLE 1

NUMBER OF SUBJECTS WITHIN EACH SEX AND
SOCIAL CLASS GROUP

                      Social class
Sex          | Lower | Middle | Upper | All groups
Boys         |  52   |   47   |  19   |    118
Girls        |  60   |   60   |  18   |    138
Both sexes   | 112   |  107   |  37   |    256


TABLE 2

MEANS OF POSITIVE REINFORCEMENT SCORES

               |        Boys         |        Girls
Social class   | Liked    | Disliked | Liked    | Disliked
               | teachers | teachers | teachers | teachers
Upper          | 24.84    | 23.21    | 22.04    | 22.55
Middle         | 25.96    | 24.77    | 25.18    | 23.57
Lower          | 25.50    | 23.81    | 24.70    | 24.00
All groups     | 25.43    | 23.92    | 24.27    | 23.97


The means of positive reinforcement scores 
for various groups are shown in Table 2. 
An analysis of variance of the positive re- 
inforcement scores indicated significant main 
effects of attitude toward teacher (F = 
41.93, df = 1/250, p < .001), sex of subject 
(F = 5.28, df = 1/250, p < .05), and sub- 
jects’ social class (F = 5.72, df = 2/250, p 
< .01). As predicted, subjects responded 
more favorably to positive reinforcement 
offered by “Liked” teachers than by “Dis- 
liked” teachers. Boys responded more favor- 
ably to the positive reinforcement than did 
girls. There was also a significant social-class 
difference. Tests of the means with Duncan's 
new multiple-range test (Kramer, 1956) re- 
vealed that middle-class subjects perceived 
the positive reinforcement as being signifi- 
cantly more rewarding than the upper-class 
subjects (p < .01). Lower-class subjects 
were intermediate between the middle- and 
upper-class subjects in how they regarded 
the positive reinforcement, and did not differ 
significantly from either of the other two 
groups.

The means of various groups for negative
reinforcement scores are shown in Table 3.
An analysis of variance of the negative
reinforcement scores revealed a significant
main effect of attitude toward teacher (F =
26.79, df = 1/250, p < .001). There were no
significant main effects on the dependent
variable due to sex or social class. In sum-
mary, when negative reinforcement was pro-
vided by a “Liked” teacher, subjects re-
garded it with more aversion than when it
was provided by a “Disliked” teacher. These
findings were also in agreement with the
prediction. However, there were no signifi-


TABLE 3

MEANS OF NEGATIVE REINFORCEMENT SCORES

               |        Boys         |        Girls
Social class   | Liked    | Disliked | Liked    | Disliked
               | teachers | teachers | teachers | teachers
Upper          | 21.32    | 19.74    | 21.44    | 20.72
Middle         | 22.26    | 21.36    | 22.70    | 21.16
Lower          | 21.48    | 20.69    | 21.80    | 21.05
All groups     | 21.84    | 20.85    | 22.19    | 21.06


cant differences between sexes and social 
classes in how they responded to the negative 
reinforcement. 


DISCUSSION


It is widely recognized that how individ- 
uals respond to events is determined in part 
by their perceptions of the events. Given 
this relationship, it may be inferred from this 
study that the attitude of the student toward 
the teacher significantly affects the extent 
to which the teacher is able to influence the student's behavior
through positive and negative reinforcement. 
Negative attitudes toward the teacher from 
students apparently have the effect of di-
minishing the reward value they assign to his
positive reinforcement. Moreover, the aver-
sive quality of his negative reinforcement
is regarded by them as being less penalizing
under such conditions. The basis of their
resistance to reinforcement under these cir-
cumstances may be explained in one way by
the tendency of individuals to avoid orienta- 
tions to events which align them with the 
apparent orientations of persons they reject. 
To accept the reinforcement in the spirit that 
it is offered would imply alignment with the 
source, and, to some extent, endorsement
of that source. 

The sex differences that were observed in 
the students’ reactions to the positive rein-
forcement should be viewed with caution.
Related findings have been obtained in other
investigations (McManis, 1965; Rosenhan &
Greenwald, 1965). But other studies have
shown results inconsistent with these findings
(Ferguson & Buss, 1960; Rowley & Stone, 
1964; Stevenson, 1961; Stevenson et al., 
1963). However, these studies concentrated
on the subjects’ overt responses to the rein-
forcement rather than their affective re- 
sponses to it. 

Although a significant difference was noted 
between two of the social classes in how they 
responded to the positive reinforcement, 
these findings are also not clear. The rela- 
tionship between social class and reaction 
to reinforcement may not be a simple one. 
An examination of the literature reveals two 
opposing positions on this question. On the 
one hand, there is the view that the lower- 
class child is conditioned by his environment 
to value the intangible rewards associated 
with the classroom less highly than his mid- 
dle- and upper-class contemporaries. This 
argument draws heavily on the works of 
Davis (1941), Douvan (1956), and Ericson 
(1947). On the other hand, there is the argu- 
ment that since the lower-class child comes 
from a background in which he has often 
been deprived of social support, he develops 
a greater need for such and is therefore more 
responsive to it when it is offered. However, 
the middle- and upper-class child’s needs 
for social support are satiated, by virtue of 
their backgrounds. The principal exponent 
of this notion is Rosenhan (1966). Consider- 
ing both arguments, it is possible that the 
outcome in this investigation relative to the 
influence of social class on reaction to posi- 
tive reinforcement was in part a reflection 
of these two conflicting tendencies.


CORRELATION OF PAIRED-ASSOCIATE PERFORMANCE WITH
SCHOOL ACHIEVEMENT AS A FUNCTION OF TASK
ABSTRACTNESS¹


To explore the effects of task and sample variation upon the relation
between paired-associate performance and school achievement, 239
Caucasian boys and girls from the third and sixth grades in lower and
upper-middle socioeconomic level schools performed two paired-as-
sociate tasks which differed only in the abstractness versus concrete-
ness of their stimulus and response materials. School achievement
was positively related to performance on the abstract task and was
generally unrelated to performance on the concrete task. As hypothe-
sized, the correlations between the abstract task and school achieve-
ment were independent of the correlations between the paired-associ-
ate tasks, indicating that the processes underlying paired-associate
performance, and probably school achievement as well, are not widely
generalizable but are dependent upon specific task and sample char-
acteristics.


The paired associate has been a standard
paradigm in psychology for decades. Re-
cently, paired-associate learning has gained
notoriety in educational research and theory
as it pertains to ethnic and social class differ-
ences in learning ability (Jensen, 1969; Roh-
wer, 1968, 1971; Stevenson, 1969) and in its
relationship to IQ and school achievement
(Rohwer, 1971; Stevenson, Hale, Klein, &
Miller, 1968). The controversy centers
around identifying the mental processes

Cooperation, and Harold Stevenson, Gordon Hale,
and Katherine Gray for their assistance. A more
detailed report of the study is available from the
author.

2 Requests for reprints should be sent to D. H.
Feldman, Department of Psychology, Yale Uni-
versity, New Haven, Connecticut 06510.


involved in paired-associate learning and 
assessing their significance for education. 

Rohwer (1971) found that performance 
deficits on paired-associate tasks for disad- 
vantaged children could be remediated by 
training them to use "mental elaboration 
skills”; he argued that the effects of these 
mental elaboration skills might also extend 
to improving school achievement for dis-
advantaged children. Rohwer claims em-
pirical support for this inference from paired-
associate learning to school achievement in
the substantial correlation between paired-
associate learning and school achievement
reported by Stevenson et al. (1968).
Rohwer’s argument can be summarized as 
follows: Mental elaboration skills improve 
paired-associate performance in disadvan- 
taged children (Rohwer, 1971); paired-as- 
sociate performance is correlated with school 
achievement (Stevenson et al., 1968); there- 
fore, mental elaboration skills will improve 
school achievement.

Feldman (1969) suggested that Rohwer
may have been unjustified in juxtaposing 
his findings with those of Stevenson et al. 
because of important differences among the 
subjects, stimulus materials, and tasks sam- 
pled by the two studies. Most importantly, 
the abstract nature of the paired-associate 
materials used by Stevenson et al. (as con- 
trasted with Rohwer’s concrete materials) 
could account for the relatively high correla- 
tions with school achievement and IQ. Con-
creteness, meaningfulness, and familiarity
of the stimuli have proved to be relevant 
variables in paired-associate learning (Goss 
& Nodine, 1965; Klein, Hale, Miller, & 
Stevenson, 1967; Paivio & Yuille, 1966; 
Underwood & Schulz, 1960). 

The purpose of the present study was not 
to test directly Rohwer’s speculation that 
mental elaboration skills would improve 
school achievement. Rather, we questioned 
Rohwer’s assumption of equivalence between 
his and Stevenson's paired-associate tasks 
in their ability to predict school achievement 
in spite of the difference in the abstractness 
of the two paired-associate tasks. We at- 
tempted to determine empirically the actual 
interrelations among the two paired-asso- 
ciate tasks and school achievement. To the 
extent that the correlation between Rohwer’s 
and Stevenson’s paired-associate tasks is 
independent of the correlation between 
Stevenson’s paired-associate task and school 
achievement, Rohwer’s argument is weak- 
ened. 

We therefore hypothesized that the 
paired-associate/school achievement correla- 
tions reported by Stevenson could reason- 
ably be attributed to the abstractness of his 
task and be independent of any variance 
common to Rohwer’s and Stevenson’s 
paired-associate tasks. To test this hypoth- 
esis we contrasted stimulus materials in the 
paired-associate task from concrete and 
familiar to abstract and unfamiliar. Sub- 
jects were stratified on sex because of previ- 
ous findings of different patterns of correla- 
tions for boys and girls (Stevenson et al., 
1968). Subjects were sampled from both 
the third and sixth grades to approximate 
Rohwer’s and Stevenson’s samples. 


D. H. FELDMAN, L. E. JOHNSON, AND T. A. MAST 


METHOD


Subjects 


Subjects were 125 boys and 114 girls from the 
third and sixth grades of two St. Paul schools, one 
in an upper-middle-class area and one in a lower- 
status area. Both school populations were pre- 
dominantly Caucasian; only Caucasian subjects 
were included in the analyses. 


Task 


Task administration closely followed Stevenson 
et al.; six stimulus-response pairs were used per 
trial, but only three trials (versus Stevenson’s 
six) were given. Tasks were presented by 16- 
millimeter color sound film; subjects responded in 
printed booklets. 


Materials 

In the paired-associate concrete task the stimu- 
lus and response elements were line drawings of 
familiar concrete objects (as in Rohwer, 1971). In 
the paired-associate abstract task the stimulus 
items were nonsense syllables and the response 
items were Japanese kanji characters (as in 
Stevenson et al.). 


Procedure 


The film, presented to intact classrooms, began 
with an announcer’s introduction to the tasks. For 
each item the stimulus appeared for 3 seconds; 
the response element was added for 3 seconds; 
and a 2-second interval separated each of the six 
items in a trial. After each trial, subjects opened 
their booklets to the appropriate page and circled 
the drawing (among six alternatives) in each row 
that went with a drawing on the left side of the 
page. Guessing was encouraged. Items were ran- 
domly ordered on the film in each trial, but order 
was held constant in the booklets. 

The procedure was then repeated for the second 
task. On both tasks, subjects were monitored to 
prevent them from consulting previous pages or
each other. Subjects with incomplete data due to
inattention or tardiness were omitted from the
analysis.

Achievement data were gathered from subjects’
cumulative files. For the third-grade subjects,
Metropolitan Achievement Test (MAT) scores on
Word Knowledge, Word Discrimination, and
Reading were recorded from tests administered in 
second grade, and Arithmetic scores from the 
tests administered in first grade. For the sixth- 
grade subjects, Lorge-Thorndike verbal IQ scores 
and Iowa Tests of Basic Skills (ITBS) scores 
(administered in sixth grade) for Vocabulary, 
Reading, Work-Study, and Arithmetic were ob-
tained. MAT and ITBS scores were selected as
measures of school achievement because of their
demonstrated high intercorrelation among groups
who have taken both tests (Finley, 1963).


TABLE 1


CORRELATIONS BETWEEN PAIRED-ASSOCIATE TASKS AND ACHIEVEMENT TEST SCORES
FOR THE THIRD GRADE

                  |        Metropolitan Achievement Test                   |
Task              | Word Knowledge | Word Discrimination | Reading | Arithmetic | PA-Concrete
Girls (n = 51)
  PA-Concrete     | -.04           | -.14                | -.13    | .10        |
  PA-Abstract     | .29*           | .29*                | .31*    | .27        | .00
Boys (n = 61)
  PA-Concrete     | .08            | .06                 | .07     | .08        |
  PA-Abstract     | .36**          | .39**               | .40**   | .30*       | .41**
Total (n = 112)
  PA-Concrete     | .07            | -.08                | -.03    | .09        |
  PA-Abstract     | .88**          | .34**               | .36**   | .29**      | .21*

* p < .05.
** p < .01.
RESULTS

Correlations were computed within grades
since the measures of school achievement
were different for the two grades, and within
sexes since Stevenson et al. reported different
patterns of correlations for boys and girls.

The correlation between the two paired-
associate tasks ranged from .00 for third-
grade girls to .55 for sixth-grade boys (Ta-
bles 1 and 2). In the third grade, the ab-
stract paired-associate task correlated sig-
nificantly with achievement scores but the


TABLE 2

CORRELATIONS BETWEEN PAIRED-ASSOCIATE TASKS AND ACHIEVEMENT TEST SCORES
FOR THE SIXTH GRADE

                  |     |          Iowa Tests of Basic Skills                       |
Task              | IQ  | Vocabulary | Reading Comprehension | Work-Study | Arithmetic | PA-Concrete
Girls (n = 63)
  PA-Concrete     | (values illegible in source)
  PA-Abstract     | (values illegible in source)
Boys (n = 64)
  PA-Concrete     | .09 | -.08       | .04                   | -.00       | .12        |
  PA-Abstract     | .16 | .24        | .11                   | .26*       | .24        | .55
Total (n = 127)
  PA-Concrete     | (values illegible in source)
  PA-Abstract     | (values illegible in source)

* p < .05.


TABLE 3


CORRELATIONS BETWEEN PAIRED-ASSOCIATE-ABSTRACT AND ACHIEVEMENT TEST SCORES BEFORE AND
AFTER PAIRED-ASSOCIATE-CONCRETE WAS PARTIALED OUT FOR THE THIRD GRADE

                     Boys                                  |                    Girls
Word      | Word Dis-   | Reading | Arithmetic | Word      | Word Dis-   | Reading | Arithmetic
Knowledge | crimination |         |            | Knowledge | crimination |         |

Correlations before PA-Concrete partialed out
.36**     | .39**       | .40**   | .30*       | .29*      | .29*        | .31*    | .27

Correlations after PA-Concrete partialed out
…         | …           | …       | …          | .29*      | …           | …       | …

* p < .05.
** p < .01.


concrete task did not; in the sixth grade, 
both tasks correlated with achievement scores 
for girls, while there was only one signifi- 
cant correlation (between the abstract 
paired-associate task and ITBS Work- 
Study) for the boys. 

To test the main hypothesis, that the cor- 
relation between paired-associate abstract 
and school achievement is independent of the 
correlation between the two paired-associate 
tasks, paired-associate concrete scores were 
partialed out of the former. This partialing 
process had no effect on the correlation be- 
tween paired-associate abstract and achieve- 
ment (Tables 3 and 4) except in the case of 
sixth-grade girls where the correlation be-
tween paired-associate abstract and achieve-
ment dropped slightly. Thus, the hypothesis
of independence was supported. 
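
The partialing step can be illustrated with the standard first-order partial correlation, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The Python sketch below is an illustration only, assuming that this standard formula is what was applied; the function name `partial_correlation` is hypothetical, and the three input values are those read from Table 1 for third-grade girls (paired-associate abstract with Word Knowledge, .29; abstract with concrete, .00; concrete with Word Knowledge, -.04).

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    x = paired-associate abstract, y = achievement score,
    z = paired-associate concrete (the variable partialed out).
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Third-grade girls, Word Knowledge, values as read from Table 1.
r = partial_correlation(r_xy=0.29, r_xz=0.00, r_yz=-0.04)
print(round(r, 2))
```

Because the two paired-associate tasks are uncorrelated in this group, removing the concrete task leaves the abstract-task correlation essentially where it started, which is the pattern the text describes.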


Discussion 


It was shown that for the two paired- 
associate tasks used in the present study, 
differing only in their stimulus and response 
elements (concrete and familiar versus ab- 
stract and unfamiliar), the abstract task 
correlated significantly with school achieve- 
ment, while the concrete task was generally 
not related to school achievement. Further- 
more, the variance shared by the two paired- 
associate tasks was independent of the sig- 
nificant positive correlations between the 


TABLE 4

CORRELATIONS BETWEEN PAIRED-ASSOCIATE-ABSTRACT AND ACHIEVEMENT TEST SCORES BEFORE AND
AFTER PAIRED-ASSOCIATE-CONCRETE WAS PARTIALED OUT FOR THE SIXTH GRADE

                     Boys                                       |                     Girls
Vocabulary | Reading       | Work-Study | Arithmetic | Vocabulary | Reading       | Work-Study | Arithmetic
           | Comprehension |            |            |            | Comprehension |            |

Correlations before PA-Concrete partialed out
.24        | .11           | .26*       | .24        | …          | …             | .40**      | .53**

Correlations after PA-Concrete partialed out
.28*       | …             | …          | …          | .25*       | .16           | .26*       | …


abstract task and school achievement. Thus,
whatever underlying processes the two 
paired-associate tasks have in common, 
these do not account for the variance relat- 
ing paired-associate performance to school 
achievement. The cognitive processes ac- 
counting for these relationships have yet to 
be adequately demonstrated, but it is clear 
that the two tasks differ with respect to their 
power to predict school achievement in third 
or sixth grades. 

The mental elaboration skills proposed by 
Rohwer to explain successful performance 
on his (concrete) paired-associate task may 
also be related to successful performance on 
abstract paired-associate tasks. The gen- 
erally substantial correlation between 
paired-associate concrete and paired-associ- 
ate abstract, particularly in the sixth grade, 
leave that possibility open. But the correla- 
tions between paired-associate concrete and 
school achievement, with the notable ex- 
ception of the sixth-grade girls are remark- 
ably low. If the effects of mental elaboration 
training do extend to school achievement, as 
Rohwer argues, it will be independently of 
the relation between concrete paired-asso- 
ciate tasks and school achievement measures. 
The correlations between school achieve-
ment and the two paired-associate tasks
indicate that different sets of skills are re- 
quired by the two paired-associate tasks. 
Correlations between paired-associate con- 
crete and paired-associate abstract provide 
evidence for a subset of skills common to 
both paired-associate tasks, but this com- 
mon subset of skills is independent of the 
skills which likewise link paired-associate 
abstract and school achievement. 

The different patterns of correlations be-
tween paired-associate abstract and ITBS
scores for the sixth-grade girls should not be
ignored. Those differences in the sixth grade
fairly well replicated the correlation between
the abstract forms paired-associate task and
ITBS scores reported by Stevenson et al.
(1968, p. 34) in the seventh grade. At first
look it appears that the sixth-grade girls,
more so than any other group, are using the
same strategies or processes on both the
paired-associate tasks and the ITBS tasks.
Research effort should be expended to iden-
tify variables which account for this reliable
sex difference in the higher grade. However,
it should be noted that in Table 4 partialing
out the paired-associate concrete scores does
not greatly reduce the paired-associate-ab-
stract/ITBS correlations. A good portion of
paired-associate-abstract/ITBS correlations
are independent of the correlation between
the two paired-associate tasks, even in this
group.
Rohwer's attempt to define, operational- 
ize, and train mental elaboration skills repre- 
sents a praiseworthy effort to explain paired- 
associate performance at a process level. 
Future studies would do well to imitate his 
strategy of externalizing processes involved 
in a task through training in the skills pre- 
sumed to account for successful performance 
on that task. The processes that account for 
school achievement, however, remain to be 
empirically established. 
IMAGINAL FACILITATION OF PAIRED-ASSOCIATE LEARNING:
A LIMITED GENERALIZATION?¹

A study and its replication are reported in which independent groups
of sixth graders received visual imagery instructions (or regular in-
structions) prior to learning to associate eight pairs of pictures (or
words). The findings were that imagery instructions for picture pairs
were generally more facilitative than imagery instructions for word
pairs, with children at this age exhibiting little variability in their
capacity to utilize a visual imagery strategy when pictures comprised
the learning materials.


The role of visual imagery in children’s 
learning has received considerable attention 
in recent years. An issue of the Psychological 
Bulletin (1970) included papers by four 
prominent psychologists (Paivio, Rohwer, 
Reese, and Palermo) based on an earlier 
symposium concerned with methods, re- 
sults, and problems associated with study- 
ing children’s imaginal processes. A subse- 
quent article (Bugelski, 1970) presented 
research findings on the problem and ap- 
peared in the American Psychologist. Hope- 
fully the interest in imagery displayed by 
such individuals, and the attendant public- 
ity, have served to legitimize investigation 
into a heretofore lowly regarded and pur-
portedly messy research area despite oc-
casional protestations (e.g., Brainerd, 1971).

Studies of imagery reported to date (us- 
ually within the context of laboratory learn- 
ing paradigms, namely, paired associate) 
may be grouped into two broad categories: 
(a) those in which imagery characteristics 
of the learning materials have been imposed 
on subjects and (b) those in which imagery
is induced in learners through the use of
instructional sets.

1 Sponsored by the Wisconsin Research and
Development Center for Cognitive Learning and
supported in part as a Research and Development
Center project by funds from the United States
Office of Education, Department of Health, Edu-
cation, and Welfare, Center C-03/Contract OE
5-10-154. The authors are grateful to Linda Mona-
han for typing the paper.

2 Requests for reprints should be sent to Joel
R. Levin, Research and Development Center for
Cognitive Learning, University of Wisconsin,
1404 Regent Street, Madison, Wisconsin 53706.

Examples of the former line of research 
may be found in the reports of Davidson 
(1964), Reese (1965), Rohwer (1967), Pai- 
vio (1969), and others. In these studies, 
verbal and pictorial paired associates have 
been presented to subjects in different 
forms. The general findings that (a) con- 
crete word pairs are better recalled than 
abstract word pairs; (b) picture pairs are 
better recalled than concrete word pairs; 
(c) interacting (or dynamic) picture pairs 
are better recalled than side-by-side (or 
static) picture pairs have been accounted 
for in terms of concreteness/image evoca- 
tiveness. Better recalled stimulus materials 
(e.g., interacting pictures) are presumably 
high in this dimension, while more poorly 
recalled materials (e.g., abstract words) are 
low. 
At the same time, several studies utilizing 
instructional sets are also noted (e.g., Bo- 
wer, 1971; Bugelski, Kidd, & Segmen, 
1968; Yuille & Paivio, 1968). In these ex- 
periments, imagery is “induced” (via a 
strategy or mnemonic) in subjects prior to 
learning. Subjects who are given the sug- 
gestion to conjure up dynamic visual im- 
ages when associating paired objects regu- 
larly outperform those who are left to their 
own devices. 

Despite the vast amount of data regard- 
ing the efficacy of induced imagery in learn-
ing, relatively little can be said about the
generality of the phenomenon across ages, 
since the successful utilization of imagery 
strategies (as described above) has typi- 
cally been observed among subjects of col- 
lege age. Only a handful of studies may be 
found in which the effect of the imagery 
set variable has been observed among sub- 
jects of elementary school age. 

Although Wolff and Levin (in press) have 
some recent data to suggest that third-grade
children can successfully utilize visual im-
agery strategies in learning to associate pairs
of toys, these findings are at odds with other
studies in which printed words have con-
stituted the learning materials (e.g., Hor-
vitz & Levin, in press). Apparently even
fifth and sixth graders experience only mini- 
mal imagery gains when having to associate 
printed word pairs (Horvitz & Levin, in 
press; Spiker, 1960; Taylor & Black, 1969).

The present experiment was conducted
using a sixth-grade sample in order to study 
the effect of a variable assumed to be re- 
lated to imaginal facilitation at this age, 
namely, the form in which the to-be-learned 
pairs are presented: either as printed words 
or pictorial representations. 


METHOD


Design and Materials 


An eight-item paired-associate list was con- 
structed by randomly pairing 16 familiar concrete 
nouns (e.g., BELT-WHEEL). The pairs were
typed onto 5 X 8 inch index cards in 44-inch type 
and pasted on pieces of 914 x, 114 inch cardboard. 
Static line drawings of the same eight pairs were 
also made and pasted on pieces of 934 X 1114 inch 
cardboard. The two pictures in each pair were 
positioned adjacent to one another. 

Subjects were given one of the two types of
materials (printed words or pictures) to learn
under either regular or imagery instructions. All
subjects were randomly assigned (in equal num-
bers) to the four type-of-materials/type-of-in-
structions combinations.


Procedure 


Each subject was tested individually by the 
second author. Prior to learning the list, instruc-
tions and three examples (with word or picture
pairs, depending on subject’s condition) were 
provided. Subjects were given standard study- 
test paired-associate instructions, in which they 
were told that they would be asked to recall out 
loud the missing word (or name of the picture) in 
each pair following a single exposure of all pairs,
one at a time. In the imagery conditions, subjects 
were additionally instructed to think of a picture 
in their minds “of the two things in each pair doing 
something together.” For each example, a subject 
was asked to describe his imagined picture, fol- 
lowing which the experimenter displayed as a pos- 
sible image a line drawing of the two objects in- 
teracting. 

After the subject indicated that he understood 
the nature of the task, the experimenter presented 
the eight pairs individually at a 5-second rate, 
while they were named aloud on a tape recorder. 
Following a 10-second interval during which the 
subject was reminded of the task, the first word 
(or picture) of each pair was shown to the subject at a
5-second rate, and in a random order different 
from that of the study trial. 


Subjects 


Twenty-four beginning sixth graders from a 
lower-middle-class residential area in the Mid- 
west served as subjects. 


RESULTS 


Learning was measured in terms of the 
number of items correctly recalled after a 
single presentation of the list. The sum- 
mary data, by experimental condition, may 
be found in Table 1, where the maximum 
number possible is eight. Since the effect 
of instructions was of interest within each 
of the presentation modes (pictures and 
words), a simple effects analysis of variance 
was conducted with the probability of a 
Type I error (α) per hypothesis set equal
to .05. The results of the analysis revealed 
a significant main effect of materials, pic- 
tures versus words (F = 8.88, df = 1/20). 
In addition, a significant simple main effect 
of instructions was obtained for the picture 
pairs (F = 10.74, df = 1/20) but not for 
the word pairs (F = 1.56, df = 1/20). As 
may be seen in Table 1, there was a 3½-item
advantage (44%) in learning picture
pairs when imagery instructions were pro-
vided; on the other hand, the 1⅓-item
gain (17%) for word pairs with imagery
instructions was not statistically signifi-
cant.


TABLE 1
SUMMARY DATA FOR EXPERIMENT I

Measure | Words, no imagery | Words, imagery | Pictures, no imagery | Pictures, imagery
M | 2.17 | 3.50 | 3.33 | 6.88
SD | 1.72 | 1.87 | 1.97 | 1.84


TABLE 2
SUMMARY DATA FOR EXPERIMENT II

Measure | Words, no imagery | Words, imagery | Pictures, no imagery | Pictures, imagery
M | 1.50 | 4.83 | 2.17 | 7.00
SD | 1.05 | 2.56 | .98 | .63
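
The item gains and percentages quoted above follow from the Table 1 means and the eight-item list length. The short sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative assumptions, not part of the original analysis.

```python
LIST_LENGTH = 8  # number of pairs in the list, as stated in the text

# Mean items recalled in the first study, as read from Table 1.
MEANS_STUDY_1 = {
    ("words", "no_imagery"): 2.17,
    ("words", "imagery"): 3.50,
    ("pictures", "no_imagery"): 3.33,
    ("pictures", "imagery"): 6.88,
}


def imagery_gain(means, material):
    """Return (gain in items, gain as a percentage of the list length)."""
    gain = means[(material, "imagery")] - means[(material, "no_imagery")]
    return gain, 100 * gain / LIST_LENGTH


for material in ("pictures", "words"):
    items, percent = imagery_gain(MEANS_STUDY_1, material)
    print(material, round(items, 2), round(percent))
```

Expressing each gain relative to the list length, rather than to the no-imagery mean, is what yields the 44% and 17% figures given in the text.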


REPLICATION STUDY 


In order to assess the durability of these 
findings, a second study was conducted on 
an independent sample of 24 sixth graders 
from a semirural community in the Mid-
west. Identical materials and procedures
were employed.

The mean number of items correctly re- 
called in the replication study is found in 
Table 2, along with their corresponding 
standard deviations. An analysis of vari- 
ance of these data once again revealed a 
significant main effect of materials (F = 
5.28, df = 1/20) and a simple main effect 
of instructions for picture pairs (F = 30.97, 
df = 1/20). However, in this experiment, a 
statistically significant simple main effect 
of instructions for word pairs was also de- 
tected (F = 14.72, df = 1/20). As before, 
the instructions difference was descriptively 
greater for picture pairs (about 5 items) 
than for word pairs (3⅓ items).

Unlike the first study where the vari- 
ability of scores within each condition was 
about the same, in Study II the variance of 
scores in the words-imagery condition (S²
= 6.57) was over 16 times greater than the
variance for the pictures-imagery condi-
tion (S² = .40) which, with df = 5/5, is
statistically significant with α = .05.
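
The variance comparison above amounts to a simple F ratio of the two sample variances. A minimal sketch follows, assuming the standard deviations printed in Table 2 and the standard tabled upper 5% point of F with 5 and 5 degrees of freedom (5.05); the function and constant names are illustrative. Because the SDs are rounded, the ratio computed here (about 16.5) differs slightly from the one implied by the variances quoted in the text.

```python
def variance_ratio(sd_a: float, sd_b: float) -> float:
    """Ratio of the larger sample variance to the smaller one."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


# Standard tabled value of F(.05; 5, 5); quoted here as an assumption,
# not computed by this sketch.
F_CRITICAL_5_5 = 5.05

# SDs for the words-imagery and pictures-imagery conditions (Table 2).
f_ratio = variance_ratio(2.56, 0.63)
print(round(f_ratio, 1), f_ratio > F_CRITICAL_5_5)
```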


DISCUSSION

The present data suggest that the ability
to use an imagery strategy in conjunction
with printed materials is far more a func-
tion of individual differences at this age
than is the ability to do so with pictorial
materials. In the first study (where sub-
jects were drawn from a relatively low
socioeconomic area) there was virtually no
variance in the inability to use imagery
effectively with word pairs; in the second
study (with children of higher social class
and approximately one-quarter of a year
older) there was considerable variability in
the ability to do so. In contrast, children
in both studies exhibited surprisingly little
variability in their ability to employ im-
agery strategies when learning picture
pairs.

3 The authors acknowledge the kind assistance
of William R. Smeaton, Elementary School Prin-
cipal, Edgerton, Wisconsin.

These findings, coupled with those which 
indicate that sixth graders can use imagery 
effectively to recall sentence-embedded 
paired associates (Horvitz & Levin, in 
press), might be interpreted with regard to 
the complexity that the imagery task poses 
for subjects. When pictures constitute the 
learning materials, in order to create a dy- 
namic image the subject simply must relate 
the two pictured objects. When pairs are 
presented in sentences, the relationship is 
already specified; the subject simply has to 
picture it. For each of these types of ma- 
terials, it will be noted that there is only one 
transformation which the subject must ac- 
tively perform: in the former case, he must 
transform the static picture into a dynamic 
picture; in the latter, he must transform 
the dynamic description into a dynamic 
picture.

When word pairs comprise the stimulus
materials, however, the imagery task re-
quires two transformations on the part of
the subject: first, he must conjure up a
static picture of the two objects; and then,
he must relate them to one another. If this
interpretation is correct, it is not surprising
that imagery instructions for words are
probably not as facilitative as they are for
pictures and sentences. This is not to say
that imagery for words is ineffectual for
children of elementary school age, as evi-
denced by the variability obtained in this
condition. Furthermore, given sufficient en-
coding (transformational processing) time,
that is, longer than 5 seconds, the differ-
ences between imagery for pictures and
imagery for words may diminish.

4 These results are especially interesting when
considered along with the finding that nonimaged
picture pairs were more easily learned than non-
imaged word pairs. On an eight-item list there
was, therefore, less room for the imaged picture
pairs (as compared to the imaged word pairs) to
be facilitative.

Thus, the facilitative effect of induced 
visual elaboration among elementary school 
children might be another instance of a 
limited generalization, being conditioned 
upon the form in which stimulus materials 
are presented. This parallels an earlier 
finding concerned with the facilitative effect 
of imposed verbal elaboration among chil- 
dren of the same age (Levin, Horvitz, & 
Kaplan, 1971). 

At the present time, far from complete 
information has been gathered regarding 
the development of learner-initiated (or in- 
duced) imagery in children. Certainly not 
as much attention has been devoted to this 
topic as has been invested in imposed im-
agery studies. Yet, several gaps are in need
of filling, especially if one believes that the 
individual difference by learning strategy 
by mode of materials interaction (as an 
extension of aptitude by treatment inter- 
actions) is of import to classroom instruc- 
tion, as do we. 
One hundred and forty-nine children in Grades 1, 2, and 3 were ad-
ministered two tests designed to assess knowledge of English sound-
symbol relationships. One test involved the recognition of the ap-
propriate grapheme, given a phoneme as a stimulus, while the second
test required children to judge whether a given grapheme could be
employed to produce a particular phoneme. Data revealed develop-
mental trends in knowledge of both phoneme-grapheme and graph-
eme-phoneme correspondences and also indicated that such associa-
tions may not be entirely symmetrical. In general, performance was
better on the phoneme-grapheme than on the grapheme-phoneme
test. A subsequent factor-analytic treatment of the data suggested
that knowledge of sound-symbol relationships may be influenced by
a complex multidimensional set of factors.


It is generally accepted that the formation 
of correspondences between letter patterns 
and the sounds for which they stand is an 
important part of the reading process 
(Hardy, Stennett, & Smythe, 1970). The 
work of Birch and Belmont (1965) and 
Muehl and Kremenak (1966) indicated 
rapid growth in mastery of visual-auditory 
equivalences occurring between the ages 5 
and 7. Calfee, Venezky, and Chapman 
(1969), who included students from Grade 3 
to college level in their study, concluded that 
increases in mastery of letter-sound cor-
respondences occurred as far as the Grade 11
level of secondary school. 

The current study sought to assess the
development of the mastery of phoneme-
grapheme (P-G) and grapheme-phoneme
(G-P) correspondences in young school-age
children, as well as to evaluate two tests
developed specifically for the study, namely,
a test requiring identification of the ap-
propriate grapheme, given the corresponding
phoneme, and another calling for recogni-
tion of the correct phoneme when the cor-
responding grapheme is presented. A sum-
mary of these results has appeared else-
where (Stennett, Smythe, Hardy, Wilson, &
Thurlow, 1970); more detailed consideration
is presented here.

1 This study was supported in part under Can-
ada Council Grant No. 870-0723.

2 Requests for reprints should be sent to P. C.
Smythe, Educational Research Services, P.O. Box
5873, London 12, Ontario, Canada.


METHOD 


Subjects 

The 149 pupils involved in this study were en- 
rolled, in Grades 1 through 3, in the Chesley 
Avenue School. This school is located in the inner 
city and contains a high proportion of children 
from lower socioeconomic backgrounds as well as 
some children from homes where the language used 
is not English. A more complete description of 
the subject population may be found in Stennett, 
Smythe, Hardy, Wilson, and Thurlow (1970). 


3 The authors would like to acknowledge the 
cooperation and assistance of Allan Speare, Prin- 
cipal of Chesley Avenue Public School, and his 
staff during the course of this study. 
Testing

The 33 elements in the phoneme-
grapheme and grapheme-phoneme tests comprised
20 consonants, 3 of which had two phoneme equiv- 
alents each, and 10 vowels. 

The G-P association test used a simple device 
which allowed the examiner to present the 33 lower- 
case letters, 1 at a time, through a small window, 
in synchrony with a tape recording of the corre- 
sponding phonemes. The examiner exposed a single 
letter and said, "See this letter. Now, listen. . . .
Does this letter say . . . ?" The examiner then
played a sound from the tape and asked the child 
to indicate whether or not the sound he heard 
“went with” the letter he was being shown. Three 
presentations of the 33 letters were made, each 
letter being paired with its correct phoneme on 
only one of the three presentations. For the young- 
est group of pupils the three presentations were 
interspersed among other tests to reduce the at- 
tentional requirements. In scoring the test, a 
pupil was considered to have completed an item 
satisfactorily if he both identified the appropriate 
letter-phoneme pair as correct and did not mis- 
identify either of the other two presentations as 
correct. The recognition test format described was 
selected in preference to one that would require 
production of the sound for each letter, since such 
a “production” test might tend to make success 
dependent upon articulatory and/or auditory 
discrimination skills. 
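
The G-P scoring rule described above can be sketched as follows. The function name and the representation of a child's three judgments as booleans are illustrative assumptions; only the rule itself (accept the correct pairing and reject both incorrect pairings) comes from the text.

```python
from typing import Sequence


def score_gp_item(said_yes: Sequence[bool], correct_index: int) -> bool:
    """Score one letter on the G-P test.

    said_yes[i] is True when the child judged that the sound played on
    presentation i "went with" the letter. correct_index marks the one
    presentation (0, 1, or 2) on which the correct phoneme was played.
    """
    if len(said_yes) != 3:
        raise ValueError("each letter is presented exactly three times")
    # The item passes only if "yes" was given on the correct presentation
    # and "no" on both of the others.
    return all(answer == (i == correct_index)
               for i, answer in enumerate(said_yes))


print(score_gp_item([False, True, False], correct_index=1))
print(score_gp_item([True, True, False], correct_index=1))
```

Under the simplifying assumption that a child guesses yes or no independently with equal probability on each presentation, an item would be passed by chance one time in eight, compared with one in four on a four-alternative item.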

In the P-G association test, each of the 33 
letters used in the G-P test was shown to the sub- 
ject along with 3 distractor letters. 

The following instructions were given by the 
examiner: 


See this long box which has four letters in it. 
Listen to this sound and put a circle around 
the letter that goes with the sound. 


The examiner then played a sound from the tape 
recorder and the subject marked the item. The 
position of the correct response was randomly 
assigned among the four available positions. In 
order to minimize the visual discrimination re- 
quirements of the test, letters representing in- 
correct responses were selected so that they were 
as visually dissimilar as possible to the correct 
letter. Dissimilarity was determined by a ranking 
procedure in which six adults had been required
to identify and rank order, in terms of visual simi-
larity, the five letters which were most similar to
each of the letters of the alphabet (see Stennett
et al., 1970, for details). Test responses were
scored as simply correct or incorrect.
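
One way the construction of a P-G item might be sketched is shown below. The report does not give the exact selection procedure, so this sketch simply assumes that distractors are drawn at random from letters outside the target's list of five most similar letters; the function name, the similarity list, and the use of a seeded random generator are all hypothetical.

```python
import random
from typing import List, Sequence


def build_pg_item(target: str, most_similar: Sequence[str],
                  alphabet: str, rng: random.Random) -> List[str]:
    """Return four response options: the target plus three distractors.

    Distractors exclude the target and the letters ranked as most
    similar to it; the position of the correct letter is randomized.
    """
    candidates = [letter for letter in alphabet
                  if letter != target and letter not in most_similar]
    options = rng.sample(candidates, 3) + [target]
    rng.shuffle(options)
    return options


# Hypothetical similarity ranking for the letter "b".
item = build_pg_item("b", ["d", "p", "q", "h", "g"],
                     "abcdefghijklmnopqrstuvwxyz", random.Random(0))
print(item)
```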

Both tests were administered individually to 
children in Grade 1 while a group form of the test 
was given in Grades 2 and 3. 


Data Handling and Statistical Analysis 


Data from the tests were coded, keypunched 
into cards, verified, written onto magnetic tape, 
edited, and corrected. All subsequent analyses
were performed on an IBM 7040 computer. 

In order to examine "developmental" trends, 
the children were divided into three chronological 
age groups, namely 77-90, 91-104, and 105-132 
months, containing 50, 51, and 48 students, re- 
spectively. The mean chronological ages of each 
of these groups when tested were 84.0, 97.1, and 
112.8 months. In Table 1, the three groups are 
referred to as 1, 2, and 3, respectively. 

Two types of analysis were conducted. In the 
first, the percentage of students in each age group 
responding correctly to each item in the P-G 
and G-P tests was calculated. In the second type 
of analysis, the intercorrelations of all 66 items 
from the two tests were calculated and the re-
sulting correlation matrix was factored.4
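
The first type of analysis, grouping children by chronological age and tabulating the percentage correct on each item, can be sketched as below. The age bands are those given in the text; the record layout and function names are illustrative assumptions.

```python
from collections import defaultdict

# Age bands in months for Groups 1, 2, and 3, as given in the text.
AGE_BANDS = [(77, 90), (91, 104), (105, 132)]


def age_group(age_months: int) -> int:
    """Return the age-group number (1, 2, or 3) for an age in months."""
    for number, (low, high) in enumerate(AGE_BANDS, start=1):
        if low <= age_months <= high:
            return number
    raise ValueError(f"age {age_months} months is outside the tested range")


def percent_correct_by_group(records):
    """records: iterable of (age_months, item, correct) tuples.

    Returns {(group, item): percentage of children responding correctly}.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for age_months, item, correct in records:
        key = (age_group(age_months), item)
        totals[key] += 1
        hits[key] += int(correct)
    return {key: 100 * hits[key] / totals[key] for key in totals}


# Hypothetical responses to a single item.
sample = [(84, "m", True), (85, "m", False), (97, "m", True), (112, "m", True)]
print(percent_correct_by_group(sample))
```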


RESULTS


Table 1 gives the results of the tests of 
phoneme-grapheme and grapheme-phoneme 
associations rank ordered in terms of total 
group performance on the phoneme-graph- 
eme test. 

At all levels tested, scores for phoneme- 
grapheme associations tended to be higher 
than those for the corresponding grapheme- 
phoneme associations; that is, it was easier 
for children to identify the letter associated 
with an uttered phoneme than to identify 
the appropriate uttered phoneme when 
shown a letter. Nevertheless, knowledge of 
both phoneme-grapheme and grapheme- 
phoneme associations revealed develop- 
mental trends. For phoneme-grapheme as- 
sociations, mean percentages correct were 
88, 97, and 98 for Age Groups 1, 2, and 3, 
respectively, while for grapheme-phoneme 
associations, the corresponding percentages 
were 79, 83, and 88.

Although phoneme-grapheme associations 
were relatively easier for the subjects of 
this study than were grapheme-phoneme 
associations, the easiest associations ap-
peared to be common to both tasks. Long
vowel sounds, whose letter names and 
sounds are identical, consonants with which 
only one sound is associated, and the short 
/&/ sound, whose sound alone constitutes a 
word, were the easiest elements in both 
tasks. 


4 Results of the factor analysis are summa-
rized only briefly in this report. A more detailed
description of these findings is available upon re-
quest from the authors.
TABLE 1
PERCENTAGES OF CHILDREN, BY CHRONOLOGICAL AGE GROUP, WHO HAVE LEARNED
PHONEME-GRAPHEME AND GRAPHEME-PHONEME ASSOCIATIONS

Letters | P-G: Group 1 | Group 2 | Group 3 | Total | G-P: Group 1 | Group 2 | Group 3 | Total
o-o | 98 | 100 | 100 | 99 | 96 | 86 | 98 | 93
s-s | 96 | 100 | 100 | 99 | 85 | 92 | 94 | 90
aii | 98 | 100 | 100 | 99 | 94 | 86 | 87 | 89
m-m | 98 | 100 | 100 | 99 | 94 | 82 | 70 | 82
r-r | 94 | 100 | 100 | 98 | 96 | 90 | 94 | 93
b-b | 92 | 100 | 100 | 97 | 90 | 98 | 98 | 95
2-8 | 90 | 100 | 100 | 97 | 90 | 94 | 98 | 94
t-t | 90 | 100 | 100 | 97 | 85 | 96 | 98 | 93
d-d | 94 | 98 | 100 | 97 | 87 | 90 | 100 | 92
ie | 94 | 98 | 100 | 97 | 90 | 84 | 98 | 91
e-a | 96 | 96 | 100 | 97 | 87 | 88 | 94 | 90
f-f | 92 | 100 | 98 | 97 | 90 | 84 | 83 | 86
w-w | 90 | 100 | 98 | 95 | 88 | 92 | 98 | 93
z-z | 90 | 100 | 98 | 96 | 88 | 94 | 98 | 93
g-g | 90 | 98 | 100 | 96 | 81 | 96 | 96 | 91
p-p | 90 | 100 | 98 | 96 | 81 | 92 | 94 | 89
dʒ-g | 100 | 91 | 98 | 96 | 75 | 58 | 80 | 71
s-c | 93 | 94 | 100 | 96 | 60 | 42 | 65 | 56
z-s | 93 | 100 | 96 | 96 | 65 | 46 | 37 | 49
dʒ-j | 86 | 100 | 100 | 95 | 83 | 88 | 98 | 90
k-k | 89 | 96 | 100 | 95 | 81 | 92 | 98 | 90
on | 90 | 96 | 98 | 95 | 83 | 90 | 91 | 88
ju-u | 85 | 98 | 100 | 94 | 88 | 82 | 98 | 89
k-c | 88 | 100 | 95 | 94 | 79 | 90 | 85 | 85
l-l | 81 | 98 | 98 | 92 | 73 | 84 | 89 | 82
h-h | 78 | 96 | 100 | 91 | 81 | 86 | 96 | 88
j-y | 80 | 96 | 98 | 91 | 58 | 78 | 94 | 77
Au | 76 | 98 | 100 | 91 | 60 | 83 | 85 | 76
ks-x | 86 | 94 | 93 | 91 | 42 | 45 | 46 | 44
ee | 74 | 95 | 95 | 88 | 58 | 84 | 76 | 73
Li | 64 | 98 | 98 | 87 | 56 | 75 | 89 | 73
v-v | 74 | 88 | 94 | 85 | 88 | 78 | 89 | 85
9o | 66 | 88 | 96 | 83 | 60 | 82 | 83 | 75
M% | 88 | 97 | 98 | | 79 | 83 | 88 |


The most difficult phoneme-grapheme as-
sociations were the short vowel sounds /&/
/8/ /i/ and the consonant /v/. Difficult
grapheme-phoneme associations were the
short vowels /i/ /8/; three consonants with
which two sounds are associated ((c) — /s/
or /k/; (g) — /g/ or /dʒ/; (s) — /s/ or /z/)
and the consonant (x) — /ks/.

To examine the relationship between P-G 
and G-P test performance, a rank-order cor- 
relation was calculated based on total group 
performance on both tests. The resultant
rho of .59 (p < .01) suggests a substantial
relationship between the two tests, but there 
is evidence that the skills involved in these 
two are not identical. Put another way, it 
may be stated that the associations repre- 
sented by P-G and G-P test performance 
are not completely symmetrical. 
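
A rank-order correlation of the kind reported above can be sketched as a Pearson correlation computed on ranks, with tied scores given their average rank. The report does not say how ties were handled, so the tie treatment here is an assumption, and the example data are hypothetical rather than the Table 1 percentages.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank-order correlation (Pearson r on average ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical total-group percentages on the two tests for five items.
pg_totals = [99, 97, 96, 91, 83]
gp_totals = [93, 95, 56, 88, 75]
print(round(spearman_rho(pg_totals, gp_totals), 2))
```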

In an earlier study, Stennett, Smythe, 
Hardy, and Wilson (1970) calculated the
total frequencies with which the beginning
and ending letters of the first 500 words in
these students' preprimer were associated
with the various phonemes. The rank-order
correlations between these frequencies and
the percentages of the total group of stu-
dents responding correctly to corresponding
items in the P-G and G-P tests were .37 and
.44, respectively. These results indicate that
students’ mastery of any particular pho-
neme-grapheme or grapheme-phoneme rela-
tionship is not very closely related to the
frequency with which it appears in their
early reading materials.


Factor Analysis 

The results of the factor-analytic treat- 
ment of the present data revealed a highly 
complex pattern of 12 factors. Contrary to 
expectations, the skills or abilities measured 
by the two tests did not result in one, or 
even two, unidimensional factors. That is,
there does not appear to be either a simple 
“knowledge of phoneme-grapheme” or a 
"knowledge of grapheme-phoneme” cor- 
respondence skill underlying the test per- 
formances obtained. Separate analyses car-
ried out on the P-G and the G-P test per-
formance support this finding, with the
P-G and G-P tests resulting in six and five 
factors, respectively. 


SUMMARY AND DISCUSSION


The present findings are in agreement with 
those of earlier investigators (cf. Birch &
Belmont, 1965; Calfee et al., 1969; Muehl &
Kremenak, 1966) in revealing clear develop-
mental trends in the mastery of visual-audi-
tory equivalences. Unlike the earlier studies, 
however, the present one also produced evi- 
dence to suggest that these associations may 
have directional properties. Thus, in terms 
of the total group performances of the
present sample, the associations from pho-
neme to grapheme appeared to be considera-
bly easier than the associations from graph-
eme to phoneme. Alternatively, it is possible
that the two tests differed from each other 
sufficiently in both format and task demands 
that direct comparisons of performance on 
them cannot be validly made. In the P-G 
test, subjects were required to make a choice 
from among four alternatives whereas in 
the G-P test, subjects merely made a yes-no 
choice. Although both tasks used a recogni- 
tion procedure in an attempt to ensure re- 
sponse availability, recent evidence re- 
ported by Tulving and Thomson (1971) sug- 
gests that retrieval problems may exist 
even in recognition memory tasks. It be- 
comes obvious, therefore, that a critical 
aspect of future research in this area will be 
the development of some technique for as- 
sessing the availability of the various pho- 
neme and grapheme forms for each subject 
so that this dimension can effectively be
controlled or varied as the need arises. 
Future research in the present program 
will be aimed at investigating the relation-
ship between knowledge of symbol-sound
relationships and the decoding aspect of 
beginning reading. Emphasis will accord- 
ingly be placed on the nature of the graph- 
eme to phoneme associations. As well as the 
type of recognition tests discussed here, 
development is underway on tests which 
will involve subjects' ability to actually 
produce the appropriate phoneme, given a 
grapheme. Other extensions of these tests
will include more complex symbol-sound 
relationships involving blends, digraphs 
and syllables. 
A BEHAVIORAL INDEX OF THE EXPLORATORY VALUE
OF PROSE MATERIALS


LARRY T. BROWN²
Two experiments were aimed at indexing the extent to which prose
materials elicit curiosity and induce further reading. The procedure
was based on an extension of methods and concepts from the field of
exploratory behavior: (a) The subject was presented with a set of al-
ternatives consisting of short excerpts from works of prose and asked
to choose among them for purposes of extended reading outside the
laboratory; (b) the subject’s prechoice exploration of the alternatives
was monitored; and (c) the subject’s verbalized choices were ranked
and recorded. Examination of the subject’s prechoice exploration re-
vealed that the number of sentences sampled from an alternative may
under certain conditions enable prediction of the alternative’s ranked
choice value and, hence, the degree to which further contact with it
is welcomed.


By the end of the 1950’s the study of 
exploratory behavior was well established 
within the mainstream of psychological re- 
search and theory (Berlyne, 1960; Fiske & 
Maddi, 1961; White, 1959). However, de- 
spite the continuing interest of several 
workers (e.g., Harrison, 1968; Hutt, 1967), 
activity in the area over the last decade or 
so has not maintained its earlier momen-
tum. One manifestation and possible cause
of this has been the relatively narrow range 
of stimulus materials investigated: While 
the close relationship between the explora- 
tion of visual patterns and the exploration 
of written materials has not gone unnoticed 
(see, for example, Berlyne’s, 1963b, discus-
sion of “epistemic behavior”), researchers
have generally continued to focus their in- 
terest on the former and to leave the latter 
unexplored. As a consequence, much has 
been learned about those properties of vis-
ual patterns and objects which are most
likely to elicit further viewing or manipula-
tion (e.g., Brown & Gregory, 1968; Piel-
stick & Woodruff, 1968), but little has been
learned about those properties of textual
and other prose materials which are most
likely to arouse the curiosity of the student
and induce him to read further.

1 The author wishes to thank Phillip H. K.
Seymour for his many thoughtful suggestions
concerning the design of the experiments reported
in this paper. He also wishes to thank Robert F.
Stanners and Robert J. Weber for their helpful
comments on a preliminary draft of the paper.

2 Requests for reprints should be sent to Larry
T. Brown, who is now at the Department of Psy-
chology, Oklahoma State University, Stillwater,
Oklahoma 74074.

Broadly speaking, of course, reading is 
the exploration of written materials, and so 
from several points of view, studies of eye 
movements and other responses occurring 
while reading may be identified as studies 
of exploratory behavior. To illustrate, 
Thomas and Augstein (1968) have studied 
the ways in which students "explore" tex- 
tual materials and have shown that the pat- 
tern of reading depends on the nature of the 
material which the individual is attempting 
to learn (e.g., facts or principles). Aside
from the stimulus materials used, however, 
there are several important differences dis- 
tinguishing studies of reading (such as 
those of Thomas and Augstein) from more 
conventional studies of exploratory behav- 
ior and motivation; brief mention of two of 
these should serve to illustrate the kind of 
approach with which the present paper is 
concerned. 

First, in studies of reading the subject is 
typically extrinsically motivated, which 
means in essence that the experimenter 
rather than the subject determines what is
to be explored. Studies of exploratory be- 
havior, on the other hand, are more con- 
cerned with intrinsically motivated explora- 
tion, and the common approach has there- 
fore been to compare and evaluate the 
effectiveness of different stimulus properties 
in eliciting exploration by allowing the sub- 
ject some degree of choice among alterna- 
tives. It is true that extrinsically motivated 
stimulus exposure constitutes a major and 
important form of exploratory behavior, 
but, though the extrinsic-intrinsic dichot- 
omy may often be a relative one (Berlyne, 
1963b, p. 289), it has by no means been 
established that the variables underlying 
the two are comparable (see, for example, 
Welker, 1961, for a discussion of “free” vs. 
“forced” exploratory behavior). Moreover, 
the intrinsic case has been by far the more 
interesting and puzzling of the two. 

Second, studies of reading have shown lit- 
tle concern with the relationship between 
patterns of reading behavior and curiosity. 
The rapid scanning of a paragraph might 
indicate, for example, that the individual 
either finds the material boring or that he 
finds it so interesting that he is anxious to 
find what the next paragraph holds in store. 
Or perhaps the wording and vocabulary of 
the paragraph are so simple that compre- 
hension is readily and quickly achieved. 

In light of these and other considerations, 
a research program was undertaken to de- 
velop a measure of the exploratory or cu- 
riosity value of reading materials which 
might help pave the way for the experimen- 
tal analysis of factors underlying reading 
interest. The following experiments report 
the rationale and outcome of this endeavor. 


EXPERIMENT I 


Probably the most commonly used meas- 
ure of exploratory behavior is the relative 
amount (time or frequency of exposure) of 
exploration directed at each of a set of 
stimulus complexes. However, when the al- 
ternatives are written materials, the stimu- 
lus complexes become chapters from text- 
books, articles from periodicals, and the 
like; and for experimental purposes it may 
be desirable to author or edit the materials, 
systematically varying the content and
carefully controlling grammatical and other
linguistic variables. Therefore, unless each
alternative contained no more than a few 
sentences (thereby encouraging the subject 
to read each in its entirety), it would ob- 
viously become necessary to construct, per- 
haps word by word, passages comprising 
thousands of words. Another, less com- 
monly used approach was consequently se- 
lected: the extent to which a brief exposure 
to an alternative elicits further examination.
Using visual patterns, for example,
Berlyne (1963a) presented subjects with 
short exposures to two alternatives and re- 
corded the pattern chosen for further exam- 
ination. It is important to note that the 
measure of choice or exploratory value in 
this case is taken after a short preexposure 
and before more extensive examination be- 
gins. When extended to a situation involv- 
ing written materials, such a procedure 
might entail the presentation of several 
brief excerpts with the request that the subject
choose among them for purposes of further
exposure. This procedure would eliminate
the necessity of designing lengthy passages
and requiring the subject to read
them under controlled laboratory condi- 
tions. The subject's choices would amount 
to an ordinal listing of preferences, but if 
some quantifiable aspect of the subject’s be- 
havior during the prechoice period should 
be found to correspond to his choices for
further exposure, the way should be clear 
for the development of a numerical measure 
of exploratory value.

In a series of pilot observations subjects 
were therefore given excerpts from different 
kinds of reading matter (e.g., textual mate- 
rial, “light” prose) and asked to rank order 
them for purposes of extended reading outside
the laboratory. Examination of several
aspects of the subjects’ prechoice perusal of
the materials suggested that a simple count 
of the pages read from each excerpt may 
under certain conditions enable prediction 
of the subject’s subsequent choice rankings.
There was also some suggestion that the 
sensitivity of the page count as a predictor
of choice value may depend on the subject's 
familiarity with the alternatives: When 
30 seconds exposure to each alternative was allowed
before informing the subject that the
experiment would call for a formulation of 
choices, the number of pages examined fol- 
lowing the choice instructions clearly re- 
flected the subject’s subsequent order of 
verbalized preferences. With no such preex- 
posure, however, or with amounts of preex- 
posure exceeding 30 seconds, the page 
counts showed relatively poorer choice dif- 
ferentiation. 

With the general aim, then, of developing 
a numerical index of the exploratory value 
of prose materials, the present experiment 
was designed to examine more closely the 
relationship between numbers of pages “explored”
by subjects while choosing among
works of prose for purposes of later reading 
(page scores) and verbalized order of pref- 
erence. The experiment was also designed 
(a) to examine the effects of alternative 
content on the relationship between page 
scores and preference rankings and (b) to
examine the possibility that the relationship 
between page scores and preference rank- 
ings is in part dependent on sex-related fac- 
tors. The latter concern was prompted by 
pilot observations suggesting that the page 
score may be more predictive of reading
preferences among women than among men. 
The former arises from the question as to 
whether, and in what ways, the subject 
matter of prose materials need be taken 
into account when using amount of pre- 
choice exploration to index exploratory in- 
terest. There is the possibility, for example, 
that some types of subject matter may be 
very popular, but nevertheless elicit rela- 
tively little examination in a choice situa- 
tion; at the same time, other kinds of mate- 
rial may be potentially unpopular, but the 
chooser may require long and studious ef- 
fort before finally arriving at a negative 
evaluation. In the present study, therefore, 
seven alternatives representing a wide range
of subject matter were used, and a search 
was made for discrepancies between level of 
prechoice exploration, on the one hand, and 
verbally indexed exploratory interest, on 
the other, which might be attributed to con- 
tent-related variables. 
Method


Subjects. The subjects were 14 male and 14 
female volunteers from first- and second-year psy- 
chology classes at the University of Dundee. 

Stimulus materials. Excerpts from seven short 
stories constituted the stimulus materials. To 
reduce the possibility of recognition by the sub- 
ject, only stories by non-British authors were 
selected. Two stories were by American authors 
(Bierce and Thurber); the rest were translations 
from works by Danish, French, German, Argentine, 
and Soviet authors (Dinesen, de Maupassant,
Lettau, Borges, and Nagibin, respectively). Each 
of the seven sets of excerpts comprised 20 sen- 
tences. The sentences were selected by the experi- 
menter to convey as faithfully as possible the 
features characterizing each of the plots. Proper 
names were abbreviated to capital letters, and 
words that might betray the nationality of the 
story were deleted, as were lengthy constructions 
irrelevant to the plot. Each of the 20 sentences 
from each story was typed on a separate sheet of 
paper, and the 20 sheets were bound in one of 
seven different colored binders. The order in which 
the sentences from each story were arranged in the 
binders was scrambled to eliminate continuity 
and thereby encourage the subject to sample and 
to avoid a thorough reading through of each set of 
excerpts from start to finish. 

Apparatus. The basic apparatus consisted of a 
wooden desktop book stand, a small control panel 
on which were mounted eight button switches, 
and a Miniscript Z event recorder. Seven of the 
button switches were coded with a color cor- 
responding to the color of one of the passages; the 
eighth switch was used to control the movement of 
the tape in the event recorder. The switches were 
silent in operation and required minimal finger 
movement for activation. The control panel was 
located in a small compartment, hidden from the 
subject's view, at the base of the book stand. The 
book stand in turn was casually placed on the 
experimenter’s desk behind and slightly to the side 
of a desk blotter, so that by resting his arms on the 
blotter the experimenter could comfortably place 
one hand in the hidden compartment and con- 
trol the switches. The wires connecting the con- 
trol panel with the recorder traveled beneath the 
blotter and down the experimenter's side of the 
desk, again out of the subject’s view, to a soundproofed
desk drawer which housed the recorder.
Each of the seven color-coded switches controlled 
one channel of the recorder. 

Procedure. The seven passages were placed in 
a row on a small table adjacent to and directly
facing the experimenter’s desk. To balance the
left-to-right location of the binders across subjects, 
14 orders were constructed with the restriction 
that each set of excerpts occur in each of the 
seven locations in two different orders. Within the 
limits of this restriction, however, the orders were 
randomly constructed. Each of the orders was used
for one male subject and one female subject. Three 
procedural stages were used: a familiarization
stage, a prechoice stage, and a choice stage. 
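
The counterbalancing restriction described above can be sketched in a few lines. The construction below is an assumption made for illustration only: it stacks two randomized 7 × 7 Latin squares, which is one way (not necessarily the one used in the experiment) of obtaining 14 orders in which every set of excerpts occupies every location exactly twice. The function names and the seed are hypothetical.

```python
import random
from collections import Counter


def random_latin_square(n, rng):
    """Return an n x n Latin square with shuffled rows, columns, and symbols."""
    # Start from the cyclic square, in which every symbol occupies every column once.
    rows = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(rows)
    cols = list(range(n))
    rng.shuffle(cols)
    symbols = list(range(n))
    rng.shuffle(symbols)
    return [[symbols[row[c]] for c in cols] for row in rows]


def balanced_orders(n_items=7, n_squares=2, seed=0):
    """Build n_items * n_squares left-to-right orders, all distinct."""
    rng = random.Random(seed)
    while True:
        orders = []
        for _ in range(n_squares):
            orders.extend(random_latin_square(n_items, rng))
        # Retry in the unlikely case that the two squares share a row.
        if len({tuple(order) for order in orders}) == len(orders):
            return orders


def location_counts(orders):
    """Count how often each item appears in each left-to-right location."""
    return Counter((loc, item) for order in orders for loc, item in enumerate(order))
```

Each of the 14 orders returned by `balanced_orders()` could then be assigned to one male and one female subject, as in the procedure above. A Latin-square construction is more constrained than the restriction strictly requires, so it samples only a subset of the admissible sets of orders.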

In the familiarization stage the subject was 
allowed to read the passages before being in- 
formed that a choice for further reading would be 
requested. As noted above, the best page-score 
predictors of reading preferences among pilot sub- 
jects were obtained following 30 seconds pre- 
exposure to each alternative. Since 4 sentences out 
of 20 were judged to be roughly comparable to 
the proportion of “information” acquired by pilot 
subjects in 30 seconds (the alternatives used in 
pilot observations contained about 60 sentences 
each), preexposure in the present study was 
limited to four pages per alternative. The instruc- 
tions for the familiarization stage were as follows: 


Before you are placed excerpts from seven 
short stories. You will note that excerpts from 
different stories are enclosed in binders of 
different colors. I would like you to read the 
first four excerpts (i.e., first four typed pages) 
from each story, starting with the one on your 
left. I have inserted a blank sheet after the 
fourth page in each binder; it will therefore 
not be necessary to worry about counting the 
pages in order to avoid going beyond the 
fourth. 

You will not be examined on what you 
read, but you will find the information useful 
in the second stage of the experiment. You 
might therefore find it advisable to examine 
the excerpts rather carefully. There is no time 
limit, so spend as much time as you wish. 


As the subject read the materials, the experi- 
menter read from a book located on the reading 
stand. After completing the sentences, the subject 
was asked if he (she) had recognized any of the 
excerpts. Only one subject indicated that a story 
(Thurber’s) “seemed familiar.” 

The instruction sheet for the prechoice stage 
stated that the experiment was concerned with 
reading enjoyment; that the subject was to choose 
one of seven “very different kinds of stories” for 
more extended and leisurely reading elsewhere; 
and that the choice should be made by reexamin- 
ing the seven sets of excerpts, which were in fact 
excerpts from the stories forming the choice. The 
subject was also told that a questionnaire was to 
be filled in after reading the story chosen, but that 
the reading should not be approached “with the 
idea of a subsequent questionnaire in mind” and 
that the questions would be “concerned only with 
very general kinds of reaction.” (The questionnaire 
was used solely to ensure that there was no doubt 
that the final choice entailed a definite reading 
commitment. This was deemed important as there 
are data [e.g., Berlyne & Lewis, 1963] which sug- 
gest that simple preference rankings of stimulus 
materials may or may not reflect the amount of 
time which subjects are in fact willing to devote to
the materials.)

A second choice was also asked for under the 
pretext that the first choice might “be from a book
which is out at the moment.” In actual fact a 
second choice was requested to head off the 
tendency, found in some pilot subjects, to make a 
single, first choice solely on the basis of prior 
familiarization (requests for two choices nearly 
always resulted in reexamination of the passages). 
No mention was made of third or subsequent 
choices, as this may have led to lengthy examina- 
tion, and hence artifactually high page scores, of 
the less favored alternatives. The final paragraph 
of the instructions was carefully worded to dis- 
courage subjects from the practice, again observed 
in some pilot subjects, of studiously examining 
every page of every alternative before verbalizing 
a choice: 


Since you are already acquainted with the 
general nature of the excerpts, there is of 
course no need to read systematically back 
through them—just glance back and forth 
among those which most interest you and give 
me your first and second choices. 


The instructions were followed by a request to 
avoid concern with author-related considerations 
and to concentrate solely on content. While the 
subject examined the excerpts, the experimenter 
surreptitiously recorded every page turned by 
pressing the appropriate button on the concealed 
panel. 

The choice stage began with the subject's ex- 
pression of his choices. The experimenter then 
switched off the recorder and asked the subject 
to list his third through seventh choices as well.
The book corresponding to the subject’s first or 
second choice was taken from a desk drawer, a
copy of the questionnaire was folded and inserted 
in it, and the subject was asked to read the story 
and return both the book and completed ques- 
tionnaire within the following two or three days. 

Finally, the subject was told that his page turn- 
ing had been monitored and was asked whether 
he had been aware of this. Several subjects indicated
that they had been “suspicious,” but no
subject reported awareness of either the concealed
bookstand compartment or the experimenter’s
recording activity. Special emphasis was
placed on the importance of the request that the 
experiment not be discussed with others until all 
subjects had been tested. 


Results 


The total number of pages examined in
each collection of excerpts was tallied for 
each subject, yielding seven page scores for 
each of the 28 subjects. The tally included 
pages examined on two or more occasions, 
so that scores exceeding 20 (the number of
excerpts per collection) were possible. Figure 1
shows the mean score, calculated
across subjects of both sexes, corresponding
to each of the seven verbal rankings.

FIG. 1. Mean number of pages read as a function
of order of choice (Experiment I). [Plot not reproduced;
axes: mean page score by choice ranking.]
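
As a rough illustration of the tally just described, the sketch below counts page turns per collection (so a page examined twice is counted twice) and then averages the resulting page scores by choice rank. The data layout, a list of binder labels in the order the pages were turned, is an assumption; the recorder output in the experiment was an event-recorder trace, not a list, and the function names are hypothetical.

```python
from collections import Counter


def page_scores(page_turns, alternatives):
    """Tally page turns per alternative; repeat visits to a page count again."""
    counts = Counter(page_turns)
    return {alt: counts[alt] for alt in alternatives}


def mean_score_by_rank(records):
    """Average page scores across subjects for each choice rank.

    records: list of (scores, ranking) pairs, where ranking lists the
    alternatives from first choice to last choice.
    """
    n_ranks = len(records[0][1])
    totals = [0.0] * n_ranks
    for scores, ranking in records:
        for rank_index, alt in enumerate(ranking):
            totals[rank_index] += scores[alt]
    return [total / len(records) for total in totals]
```

With 28 records of seven alternatives each, `mean_score_by_rank` would return the seven means of the kind plotted in Figure 1.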

To evaluate the relationship between 
choice ranking and mean number of pages 
read, the page scores were evaluated using a
2 × 7 analysis of variance, with sex constituting
the first factor and order of choice,
on which repeated measures were taken, the
second.

The mean score for the men (9.15) did 
not differ significantly from that of the 
women (7.92), nor did the sex factor interact
significantly with order of choice (Fs <
1.00, df = 1/26 and 6/156, respectively).
However, the tendency, shown in Figure 1, 
for mean page scores to decline as verbal 
rankings declined proved to be statistically 
reliable (F = 10.74, df = 6/156, p < .001). 
The results of the Newman-Keuls test were 
as follows: The means for the first and second
choices were significantly greater than
those for the five remaining choices (p <
.05), but did not differ from each other. The
means for the third and fifth choices were,
moreover, significantly greater than that for
the seventh choice (p < .05), but none of
the remaining differences were statistically
reliable.

To determine whether some stories received
more favorable mean rankings than
others, the choice rankings were examined
by means of two Friedman two-way (Stories
× Subjects) analyses of variance of ranks,
one for the men and one for the women.
Evidence for group preferences among stories
was revealed for the men (χ² = 13.36,
df = 6, p < .05), but not for the women (χ²
= 10.16, df = 6, p < .10).
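
For readers unfamiliar with the Friedman analysis, the sketch below computes the statistic from a Stories × Subjects table of ranks using the standard formula χ² = 12 ΣR² / [nk(k + 1)] − 3n(k + 1), where R is the rank sum for a story, n the number of subjects, and k the number of stories. It assumes untied ranks and is offered only as an illustration; the raw rankings from the experiment are not reproduced here, and the function name is hypothetical.

```python
def friedman_chi_square(rank_table):
    """Friedman statistic for a table of within-subject ranks.

    rank_table: one row per subject, each row holding the ranks 1..k
    that the subject gave to the k stories (no ties assumed).
    Returns (chi_square, degrees_of_freedom).
    """
    n = len(rank_table)
    k = len(rank_table[0])
    # Sum the ranks received by each story across subjects.
    rank_sums = [sum(row[j] for row in rank_table) for j in range(k)]
    sum_sq = sum(r * r for r in rank_sums)
    chi_square = 12.0 * sum_sq / (n * k * (k + 1)) - 3.0 * n * (k + 1)
    return chi_square, k - 1
```

With seven stories the function returns 6 degrees of freedom, matching the df reported above. The statistic is zero when all stories receive equal rank sums and reaches n(k − 1) when every subject gives the same ranking.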

The page scores were also examined for 
evidence of story preferences by two sin- 
gle-factor (stories) analyses of variance for 
repeated measures, one again for the data 
of the men and one for those of the women. 
Consistent with the analysis of the ranking 
data, no evidence for story preferences was 
indicated for the women (F = 1.48, df = 
6/78, p > .10). The F value for the men, 
however, while greater than that for the 
women and therefore consistent in direction 
with the results of the ranking analysis, fell 
short of the .05 level of significance (F = 
1.92, df = 6/78, p < .10). Table 1 presents 
the mean ranking and page score of the men 
for each of the seven stories. The rank or- 
ders of the two sets of means (given in 
brackets) show good correspondence for six 
of the stories, with each differing by one 
place. It can be seen, however, that a some- 
what larger discrepancy occurred for the re- 
maining story (Thurber): Although this 
story received the highest mean page score, 
it was ranked as relatively unpopular. A
possible reason for this lack of agreement
between the two measures will be advanced
in the discussion below.


TABLE 1
MEAN RANKINGS AND PAGE SCORES OF MEN FOR
SEVEN STORIES: EXPERIMENT I

Story author      Mean ranking    Mean page score
de Maupassant     2.6 (1)         10.71 (2)
Bierce            3.4 (2)          9.79 (3)
Borges            3.6 (3)          9.50 (4)
Lettau            4.1 (4)          8.64 (5)
Thurber           4.3 (5)         10.93 (1)
Nagibin           4.9 (6)          6.14 (7)
Dinesen           5.1 (7)          8.36 (6)

Note. Numbers in brackets give rank ordering
of means.


Discussion

The results of this experiment, in agree- 
ment with pilot observations, suggest that 
when subjects are given collections of ex- 
cerpts from works of prose and asked to 
choose among them for further reading, a 
simple count of the pages read from each 
collection may yield a set of scores reflect- 
ing the subjects’ subsequent choice rank- 
ings. The finding that the relationship be- 
tween verbal rankings and number of pages 
read was independent of the sex factor 
argues against any basic sex differences in 
the way written materials are examined. It 
is still possible of course that some types of 
content may lead to extensive examination 
but low ranking, or vice versa, and that 
such discrepancies may be sex-related. For 
the more general case, however, the results 
of the present study indicate that the be- 
havioral indexing of reading interests is rel- 
atively unaffected by the sex of the sub- 
ject. 

The question of sex differences as a meth- 
odological problem is, of course, quite dif- 
ferent from the question of sex differences 
as a variable in reading preferences, 
whether verbally or nonverbally indexed. 
The analyses of both the ranking and the 
page-score data in the present study sug- 
gested that the men, but not the women, 
preferred some stories over others. The fact 
that the women as a group did not show 
preferences may be important, for it rules 
against the possibility—at least for the 
women—that the high page scores asso- 
ciated with the first and second choices 
were story- rather than choice-specific. In 
other words, it appears that the highest 
page scores were due to the choice value as 
such of all seven stories and not to stylistic 
and other features unique to one or two 
more favored stories (each of the seven sto- 
ries was ranked as first or second in choice 
by at least two women). With regard to the 
men’s data, it is to be expected on the basis 
of the significant relationship between order 
of choice and mean number of pages read 
that the results of the page-score analysis
should mirror to some extent those of the 
ranking analysis. An interesting discrepancy
did arise, however, in the case of the
story by Thurber, a story which elicited 
more examination than its relatively low 
mean ranking would predict. 

At least one explanation for this discrep- 
ancy suggests itself: Several of the Thurber 
excerpts were clearly humorous, commonly 
evoking smiles and laughter, and so may 
have been conducive to further reading for 
immediate pleasure, but verbal rejection in 
the face of a larger “dose.” Such a possibil- 
ity is interesting and needs further re- 
searching; however, the possibility of 
unique short-term interactions between con- 
tent variables and reading strategies could 
pose a major, though not necessarily insur- 
mountable, challenge to the design and use 
of behaviorally based preference measures. 
It is important, however, to make a distinc- 
tion here. There may be some types of ma- 
terial which interact in unique ways with 
reading behavior in an experimental situa- 
tion, and these will of course raise special 
problems for the laboratory assessment of
reading preferences. Other types of material,
however, may affect reading behavior
in the laboratory in ways that are reflected
in verbal choices when such choices commit
the chooser to only a moderately small
amount of further exposure, but not when
they entail rather more extensive exposure.
In the present study, it was made clear that 
the choices were with reference to prefer- 
ences for extended reading. If the instruc- 
tions had indicated that the choices were 
for materials, say, only a few pages in
length, a different set of results might have
emerged. From a methodological point of
view, this means that when the subject’s
reading patterns in a laboratory setting are
evaluated against verbally expressed preferences,
it should be made clear as to
whether the preferences relate to short- or
long-term reading. It may prove with further
research that even greater specificity is
desirable.

EXPERIMENT II

In Experiment I there was a tendency for
exploratory responses to center more on second
than on first choices. Moreover, pilot
subjects were often observed to direct more 
exploratory activity at alternatives ranked 
as second or third than at those ranked as 
first. It is possible that such concentrations 
of exploratory interest on alternatives 
ranked as second and/or third were unique 
to the particular subjects and alternatives 
sampled, but it is also possible that it was 
more generally a consequence of the procedure
and instructions. Should the latter be
the case, a more valid, and therefore useful, 
page-score measure will clearly depend on 
further changes in the methodology. By em- 
ploying a larger number of subjects and a 
broader range of alternative content, an ef- 
fort was therefore made in Experiment II to 
determine whether a peaking of the “choice 
function” at alternatives ranked as second 
or third may indeed be a general phenome- 
non associated with the particular proce- 
dures reported in this paper. 

Since (a) a larger number of subjects was 
tested, (b) a wider range of content varia- 
bles was employed for the alternatives, and 
(c) the subjects’ page turning was automat- 
ically monitored, the present study also al- 
lowed for a more broadly based assessment 
of the page score’s potential usefulness. 


Method 


Subjects. The subjects were 50 female under- 
graduate volunteers from the University of 
Dundee. The majority of the volunteers were 
from first- and second-year psychology classes. 

Stimulus materials. The stimulus materials consisted
of 100 sentences selected from a group of
short stories by twentieth century authors (e.g.,
Ernest Hemingway). The sentences were selected
to represent a wide variety of behaviors, sources of
frustration, emotional dimensions, etc., found in
fictional materials (e.g., physical aggression,
poverty, grief). Each sentence was typed on a
separate sheet of heavy paper to form a page, and
20 pages were inserted in each of five loose-leaf
binders. The combinations of sentences assigned
to the binders were changed after the testing of
every two subjects, so that 25 different sets of
combinations were used in all. Assignment of the
sentences to the binders was governed by the
restriction that, across subjects, each sentence
appear in each of the five binders with equal
frequency.
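
One way of satisfying the equal-frequency restriction is sketched below. It is an assumption for illustration, not a description of how the assignments were actually drawn up: the 100 sentences are randomly partitioned into five blocks of 20, the blocks are rotated through the five binders to give five sets, and the whole step is repeated with fresh partitions until 25 sets exist. Function names and the seed are hypothetical.

```python
import random
from collections import Counter


def binder_assignments(n_sentences=100, n_binders=5, n_sets=25, seed=0):
    """Return n_sets assignments; each is a list of n_binders sentence lists."""
    if n_sentences % n_binders or n_sets % n_binders:
        raise ValueError("sizes must be multiples of the number of binders")
    rng = random.Random(seed)
    block_size = n_sentences // n_binders
    sets = []
    for _ in range(n_sets // n_binders):
        # Fresh random partition of the sentences into equal blocks.
        ids = list(range(n_sentences))
        rng.shuffle(ids)
        blocks = [ids[b * block_size:(b + 1) * block_size] for b in range(n_binders)]
        # Rotate the blocks so that each block visits each binder once.
        for shift in range(n_binders):
            sets.append([blocks[(b + shift) % n_binders] for b in range(n_binders)])
    return sets


def binder_counts(sets):
    """Count how often each sentence appears in each binder across sets."""
    return Counter(
        (sentence, b)
        for assignment in sets
        for b, binder in enumerate(assignment)
        for sentence in binder
    )
```

Under this scheme every sentence appears in every binder five times across the 25 sets. The five sets built from one partition share the same groupings of sentences, so a study wanting more varied combinations would need a less structured construction.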

Apparatus. The subject’s table was placed 
against a wall and at a right angle to the experimenter’s
desk. The apparatus included a slanted
shelf, 1.52 meters in length, mounted with a row
of five switches and located 30.48 centimeters
above and 20.32 centimeters back from the front 
edge of the subject’s table. Each of the five binders 
containing the collections of sentences was placed 
on top of one of the switches. An electric beam 
source was housed in a wooden compartment and 
affixed to one side of the table, and a photoelectric 
cell, also housed in a wooden compartment, was 
attached to the opposite side. The two were 
separated by a reading space of 71.12 centimeters. 

The beam source and cell were so positioned 
that the beam was broken whenever a page in a 
binder resting on the table was turned. The shelf 
switches and photoelectric switch were connected 
with six channels of a concealed sound-proofed 
event recorder. This arrangement allowed auto- 
matic monitoring of the alternatives examined by 
the subject, as well as the number of pages looked 
at in each.

Procedure. The procedure was essentially the 
same as in Experiment I; the only departures were 
those made necessary by the use of the apparatus.
Each subject was first acquainted with the opera- 
tion of the photoelectric beam and, with the help 
of a practice binder, shown how to place a binder 
on (or remove it from) the table without break- 
ing the beam. It was stressed that while a page was 
being read the binder should lie flat, and that 
skipping of pages, whether forward or backward, 
should be executed in a single, combined turn of 
the pages involved. Orientation to the apparatus 
was completed by allowing the subject a minute or 
two to familiarize herself with the binder-place- 
ment, page-turning, and replacement procedures. 
The practice binder was used for this purpose. 

The familiarization instructions were similar to 
those of Experiment I, except that the sentences 
were described as randomly ordered excerpts from 
five different novels. The choice instructions were 
also similar to those of Experiment I. 


Results and Discussion 

The mean page scores associated with the 
five verbal rankings are shown in the upper 
curve of Figure 2. The Newman-Keuls test 
indicated that (a) the mean score for sec- 
ond-choice alternatives was greater than 
those for all other alternatives, including 
the first choice; (b) the means for the first- 
and third-choice alternatives did not differ, 
but both were significantly greater than 
those for the fourth- and fifth-choice alter- 
natives; and (c) the means for the last two 
choices did not differ (p < .01 for all re- 
ported differences except that between the 
third- and fourth-choice means, in which p
< .05). It is clear then that, once again,
alternatives ranked as second in choice re- 
ceived more prechoice exploration than 
those ranked as first. A possible explanation 
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Fic. 2. Mean number of pages read as a func- 
tion of order of choice for Stages 1 and 2 combined 
("total") and for Stage 1 only (Experiment II). 


for this phenomenon, however, was sug- 
gested in the present study by the behavior 
of the subjects during the prechoice period: 
Following the choice instructions it ap- 
peared that the subjects proceeded in two 
distinct stages, the first involving a reexam- 
ination of each of the alternatives (Stage 
1) and the second, a return to a restricted 
number for further examination (Stage 2). 
It may be hypothesized that Stage 1, the 
reexamination of each alternative, served to 
provide further information, and since some 
alternatives elicited greater reading interest 
than others, choice evaluations were also 
being made, By the beginning of Stage 2, 
then, the subjects had narrowed the field 
down to their favorites and, because of the 
specific request to list the first two choices, 
were largely concerned with formulating 
their first and second rankings. This con- 
cern in turn may have resulted in an in- 
struction-specific exaggeration of the differ- 
ence between the page scores for the first 
two or three choices on the one hand, and 
those for the fourth and fifth on the other. 
The peaking of the choice function at the 
second choice, moreover, may have reflected 
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the relative ease of making a first choice 
and the relative difficulty of arriving at a 
second and final choice. Following this line 
of reasoning, then, it is possible that the 
nonmonotonic nature of the overall choice
function was largely instruction-specific 
and that the choice behavior of Stage 1 may 
in fact have mirrored with greater validity 
the actual choice value of the five alterna- 
tives. 

The individual records were therefore ex-
amined, and it was found that 47 of the 50
subjects indeed looked at each of the five
alternatives, though not necessarily in
strict left-to-right order, following the
choice instructions, and of these 41 then
proceeded to return to several (median =
2) of the more popular alternatives before
voicing their choices. Page scores based
solely on the first run-through of the alter-
natives following the choice instructions
were therefore computed, and as may be 
seen in the lower part of Figure 2, the 
means did in fact describe a monotonic re- 
lationship with verbalized choice value. 
Moreover, when the relation between mean 
page score and order of choice was exam- 
ined by subjecting the 250 Stage-1 page 
scores (five alternatives x 50 subjects) to a 
single-factor analysis of variance with re- 
peated measures, order of choice was found 
to constitute a significant component of the 
variance (F = 4.89, df = 4/196, p < .01). 

From the point of view of relationship to 
verbal choice, the validating criterion, it is 
clear that the Stage 1 score may offer more 
promise than the overall score as a choice 
measure. For reasons spelled out earlier, it
is necessary to impose some limit on the 
number of choices requested in the instruc- 
tions, and as the number specified may in- 
fluence the subject’s choice strategy in ways 
similar to those suggested for the present 
study, it is clearly advisable to seek a
measure as free from such influence as pos- 
sible. The results of this study suggest that 
the behavior of the subject during Stage 1 
may provide the source for just such a
measure. 

Although the primary purpose of the 
present study was methodological, it may 
be instructive to note briefly one of the find- 
ings emerging from a comparison of alter-
natives receiving large page scores (and fa-
vorable rankings) with those receiving
small scores (and unfavorable rankings). 
Alternatives containing sentences suggesting 
poverty elicited a high degree of interest 
when sentences making reference to fear 
were also present; however, alternatives 
containing the same poverty sentences in
combination with sentences selected to ex-
press a cheerful, carefree mood elicited very
little interest. When subjects receiving the 
former type of alternative were asked to 
give reasons for its popularity, responses 
were vague and included statements such as, 
"It deals with social problems." It is here 
that the advantages of extending the no- 
tions of exploratory behavior to the analy- 
sis of reading interest become apparent: 
When (a) prose materials are systemati- 
cally constructed in a manner analogous to 
the construction of visual and other stimu- 
lus complexes, (b) exploration of the mate-
rials is monitored in the context of a choice
situation, and (c) both exploratory scores 
and preference rankings are evaluated and 
compared, a clearer picture begins to 
emerge as to what factors, and combinations 
of factors, may govern the interest value of 
prose materials. Of equal importance, con- 
tent-specific discrepancies between explora- 
tory scores and preference rankings may 
yield information relating to such problems 
as short-term versus long-term reading in- 
terests, or reading interests which because 
of social or other pressures are not verbally 
expressed.
RECALL OF PLACE ON THE PAGE¹


EUGENE B. ZECHMEISTER² AND JACK McKILLIP
Loyola University of Chicago 


Two experiments that investigated the reliability of spatial retention 
for information recall from text material are reported. In Experiment 
I, 40 subjects read a 4,000-word passage and were asked to (a) answer 
20 fill-in questions, (b) rate confidence, and (c) indicate the place on 
the page of the correct fill-in answer. Text material was typed in four 
quadrants on a page, and subjects indicated spatial knowledge by 
checking a square corresponding to a specific corner on the page. Ex- 
periment II included a multiple-choice test following fill-in questions. 
In both experiments, spatial recall was highly reliable. Also, spatial 
retention was more likely for right than wrong fill-in answers. How- 
ever, spatial memory apparently did not affect confidence in item 
recall, nor did a spatial attribute contribute to differential multiple-
choice performance. Spatial memory was interpreted as reflecting a
shift in the subject’s attribute hierarchy to a less dominant attribute
of memory.


Students occasionally report the vexing 
experience of being unable to recall an an- 
swer to an examination question, but of 
being able to remember exactly where the 
answer is located on a textbook page. Un- 
derwood (1969) suggested that these re- 
ports, and similar experiences, provide evi- 
dence for a spatial attribute of memory, al- 
though the reliability of these reports re- 
mains relatively undocumented. Further- 
more, the significance of a spatial attribute 
for information recall from text material
has not been investigated. The present 
paper addresses itself to these problems. 

According to an attribute theory of mem- 
ory, a spatial attribute is considered pri- 
marily discriminative in function (Under- 
wood, 1969). That is, it is likely that spa- 
tial encoding permits differentiation be- 
tween memories such that less interference 
would be expected between memories of dis-
tinct spatial disposition. In this respect,
spatial recall may also serve as a highly 
effective retrieval system (Bower, 1970). 


¹ Results of Experiment I were reported at the
meeting of the Midwestern Psychological Associa- 
tion, Detroit, May 1971. 

² Requests for reprints should be sent to Eugene
B. Zechmeister, Department of Psychology, Loyola 
University, 6525 North Sheridan Road, Chicago, 
Illinois 60626. 


When retention is tested for textbook mate- 
rial, memory of place on the page may pro- 
vide a discriminative cue for information 
recall. 

Therefore, it seems reasonable to suggest 
that spatial recall is related to the strength 
of a memory experience. For example, given 
two memories for an event, one of which is 
"richer" in attributes, it might be expected 
that this difference will be reflected in the 
confidence a person has in his recollection 
of that event. Strength would be enhanced 
if additional discriminative cues provided 
better differentiation among similar mem- 
ory experiences. This interpretation sug- 
gests that correct information retrieval with 
correct spatial recall will lead to greater 
confidence in that memory than item recall 
in the absence of spatial knowledge. 

When item recall is not successful, a spa-
tial attribute may mediate a subject’s
subjective belief that a memory has been 
established. Therefore, spatial recall may 
precipitate a feeling-of-knowing experience 
(Hart, 1965), or represent an aspect of re- 
tention precipitating a tip-of-the-tongue 
state (Brown & McNeill, 1966). In this sit- 
uation, a spatial attribute may facilitate 
performance when memory is tested by al- 
lowing subjects to choose the correct item 
from among several alternatives on a recog-
nition task.

Two experiments that assessed the relia- 
bility of spatial recall for information from 
a lengthy prose passage are reported. In 
both studies, subjects were presented with 
study booklets and later required to answer 
very specific questions about the text and 
rate confidence. Then, the subject was 
asked to indicate in what corner of the 
booklet page the answer was to be found, 
regardless of whether correct item recall 
could be accomplished. Experiment II in- 
cluded a multiple-choice test following item 
recall. Recognition performance for items 
not recalled, but for which spatial recall 
was present, was compared to performance 
on items without spatial retention. Also, it 
was of interest whether providing sub-
jects with knowledge of the correct item’s
location in the text would aid retention 
measured by either recall or recognition. 
Therefore, one-half of the subjects in Ex- 
periment II were provided spatial informa- 
tion for both fill-in and multiple-choice 
tests. 


EXPERIMENT I


Method 


Materials. A lengthy prose passage (approxi-
mately 4,000 words), containing biographical in-
formation about two famous psychologists, was
selected from E. G. Boring’s A History of Experi-
mental Psychology (1950, pp. 508-517).³ Four dif-
ferent orders of the passage were typed in booklet
form with the following characteristics: (a) pages
were divided into four “blocks” by typing, single
space, two columns of prose with triple spacing in
the middle of each column; (b) material appeared
in 27 blocks; and (c) each study order began with
the first block occupying a different quadrant of
the first page. Elite type was used with the result
that the four typewritten blocks were each ap-
proximately 3⁄4 × 4¼ inches on the page (about
150 words). Order of reading was top left, bottom
left, top right, through bottom right. Therefore,
the booklets were either seven or eight pages in
length, depending on the beginning quadrant. The
first page also included a title for the material,


³ The authors gratefully acknowledge the per-
mission to reproduce this text given by Appleton-
Century-Crofts, Educational Division, Meredith
Corporation. Special thanks are also due to Nell
inch for her careful typing of the long passages
under the particular restraints used in the present
research.
“William James and G. Stanley Hall,” that was
arranged logically with the beginning of the text
and that necessarily appeared as a cover sheet for
one study order.

Twenty fill-in-the-blank questions were pre- 
pared, based on the text material. Questions were 
taken from the middle 20 blocks only and required 
one- or two-word answers specific to a block. Word-
ing was maintained from the text as much as
possible. The following are two examples of the ques-
tions:

Q. It was on an expedition to ______ that
James made one important discovery; he
found that he was a philosopher.

Q. In his psychology, basically James faced the
problem of conscious knowledge. This
puts him partly in the ______ tradition.

In addition to each fill-in question, a 5-point con- 
fidence scale was provided. Finally, a box sec- 
tioned into four quadrants was placed on each 
question page. Two orders of the test booklet were 
prepared with the restriction that no two consecu- 
tive blocks be tested contiguously and that half 
of the fill-in questions in each test half were 
from the first 10 blocks and half from the last 10 
blocks. One test order was the reverse of the other. 

Procedure. Study booklets were distributed to
the subjects with oral instructions that their 
memory would be tested over material from the 
passage. No time limit was imposed and subjects 
were instructed to read carefully, as if for a class 
exam, but to avoid going back over material once 
it was read. Furthermore, a blank sheet of paper 
was provided for the subject to record time when 
reading was completed; however, it was stressed 
that this was not important. 

Upon finishing the passage, the subject raised 
his hand and was given a test booklet with written 
instructions. The test instructions emphasized the 
three requirements of the retention test: (a) at- 
tempt to answer each fill-in question and guess 
when not sure; (b) indicate confidence in the fill-in 
answer on the 5-point scale; and (c) for every 
question, place an X in the square designating the 
appropriate location of the answer on the text 
page. Again, subjects were asked to record their 
time and to avoid referring back to test questions 
once the booklet page was turned. 

Subjects. All subjects were volunteers from 
upper-division psychology classes. Advanced 
undergraduates were used since pilot work sug- 
gested that the subject matter and vocabulary of 
the passage provided a difficult reading task for
introductory psychology students. Study-test con- 
ditions were randomized in blocks of eight (four 
study and two test orders) and assigned randomly 
to subjects in small groups. A total of 40 subjects 
participated. 


Results 


Recall. Four study orders that differed
only in the initial starting point for the text 
material were used. As expected, analysis 
TABLE 1
FREQUENCY OF CORRECT ITEM AND SPATIAL
RECALL FOR EACH OF TWENTY
FILL-IN QUESTIONS

Question     1   2   3   4   5   6   7   8   9  10
Item        31  24  36  14   5  28  17  20   3  22
Spatial     22  18  13  11  21  28  22  17   7  17

Question    11  12  13  14  15  16  17  18  19  20
Item         7   5  11  29   1   8   0  19  22  29
Spatial     19  19  20   8  13  16  14  19  28  22


of variance of the number of correct fill-in 
responses confirmed no difference as a func- 
tion of beginning quadrant or test order 
(both Fs < 1). The interaction between
study and test orders was also not signifi- 
cant (F = 1.94, df = 1/32, p > .05). Mean 
correct item recall for the 20 fill-in ques- 
tions was 8.28 (SD = 2.94). Of the possible 
800 items, 41.37% were correct fill-in re- 
sponses. Also, the number of questions left 
blank was relatively few overall (X̄ =
2.88), indicating that subjects followed the 
instructions to guess. 

Mean spatial recall, as indexed by the 
number of correct quadrants checked by the 
subjects, was 8.88 (SD = 2.50). This result 
was highly significant when compared to 
that expected on the basis of chance re-
sponding (t = 9.79, df = 39, p < .001;
chance = 5.00). Overall proportion of cor- 
rect choices was .44 and resulted in a z 
value of 12.65 for a proportion test based on 
the total number of instances. Therefore, 
recall of place on the page was a reliable 
phenomenon when the corner of the page 
was indicated. However, it should be noted 
that the absolute level of spatial retention, 
after correcting for chance, was relatively 
small when compared to that of item recall. 
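
The chance comparisons reported above can be approximated from the summary values alone. The Python sketch below is illustrative and is not the authors' computation: the function names are hypothetical, the count of 355 correct quadrant choices is an assumption obtained by rounding 8.88 × 40, and the correction for guessing is the conventional (p − c)/(1 − c) formula, which the text does not specify.

```python
import math


def one_sample_t(mean, sd, n, chance):
    """t statistic for a sample mean tested against a fixed chance value."""
    return (mean - chance) / (sd / math.sqrt(n))


def proportion_z(successes, trials, p_chance):
    """Normal-approximation z for an observed proportion tested against chance."""
    p_hat = successes / trials
    standard_error = math.sqrt(p_chance * (1 - p_chance) / trials)
    return (p_hat - p_chance) / standard_error


def corrected_for_guessing(p_observed, p_chance):
    """Conventional correction for guessing (assumed; not stated in the text)."""
    return (p_observed - p_chance) / (1 - p_chance)


# Reported summary values for Experiment I: 40 subjects, 20 questions, 4 quadrants.
N_SUBJECTS, N_QUESTIONS, P_CHANCE = 40, 20, 0.25

# Mean number of correct quadrants (8.88, SD 2.50) against the chance value of 5.
t_spatial = one_sample_t(mean=8.88, sd=2.50, n=N_SUBJECTS,
                         chance=N_QUESTIONS * P_CHANCE)

# Assumption: total correct quadrant choices = mean x subjects, rounded.
total_correct = round(8.88 * N_SUBJECTS)
z_spatial = proportion_z(total_correct, N_SUBJECTS * N_QUESTIONS, P_CHANCE)

# Reported overall proportion of correct quadrant choices, adjusted for guessing.
spatial_corrected = corrected_for_guessing(0.44, P_CHANCE)

print(f"t = {t_spatial:.2f}, z = {z_spatial:.2f}, "
      f"chance-corrected spatial proportion = {spatial_corrected:.2f}")
```

Because the inputs are rounded summary values, the t obtained this way falls near, but not exactly on, the reported 9.79.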

Table 1 shows the frequency of correct 
item and spatial recall for each of the 20 
fill-in questions. It is apparent that the dif- 
ficulty of specific fill-in items was not con- 
sistently related to the frequency of spatial 
recall. For example, for Questions 3, 5, 11, 
and 14, the relationship is particularly in-
verse. Questions 6 and 18 suggest the oppo-
site. A Spearman rank correlation between
questions ordered on frequency of item and 
spatial recall was .28 (ns). 
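
As a rough check on the rank correlation, the frequencies can be ranked and correlated directly. The following Python sketch assumes the per-question frequencies as read from Table 1 (the split of the second item row into ten values is itself an assumption), handles ties with average ranks, and uses hypothetical helper names; the authors' exact tie handling is not stated.

```python
import math

# Frequencies of correct recall for Questions 1-20, as read from Table 1.
ITEM = [31, 24, 36, 14, 5, 28, 17, 20, 3, 22,
        7, 5, 11, 29, 1, 8, 0, 19, 22, 29]
SPATIAL = [22, 18, 13, 11, 21, 28, 22, 17, 7, 17,
           19, 19, 20, 8, 13, 16, 14, 19, 28, 22]


def average_ranks(values):
    """Rank values in ascending order, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


rho = spearman_rho(ITEM, SPATIAL)
print(f"Spearman rho between item and spatial frequencies = {rho:.2f}")
```

With these values the coefficient comes out close to the reported .28, which also supports the reading of Table 1 used here.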

However, spatial recall was more likely 
for right fill-in answers than for wrong 
items when individuals were considered. For 
each subject, spatial retention was com- 
pared for right and wrong items. A signifi-
cantly greater proportion of spatial answers
was obtained for correctly answered ques- 
tions (t = 2.76, df = 39, p < .01). The 
specific proportions were .52 and .39 for 
right and wrong recall, respectively. Both 
these figures were significantly above 
chance (t = 8.08, p < .01 for right items; t
= 6.08, p < .01 for wrong recall). 

Both item and spatial recall were exam- 
ined as a function of position on the page. 
Since all questions appeared equally at each 
of the four positions, an indication can be
had concerning the favorableness of any 
particular quadrant for recall. For each 
page, order of reading was from top-left 
through bottom-right blocks of text. The 
proportion of correct item recall for each of 
these quadrants was .23, .26, .27, and
.24, respectively. Obviously, no specific po-
sition favored presentation of the informa-
tion for item recall.

Table 2 reports the proportion of spatial 
recall for each of the page positions when 
all responses were considered. Rows in
Table 2 correspond to the item's true posi-
tion, whereas column figures indicate the
proportion of instances the subject checked
each corner, given an item’s true position.
Therefore, the diagonal represents the pro-
portion of correct spatial choices for each of
the four corners of the page.

TABLE 2
PROPORTION OF SPATIAL RECALL FOR EACH OF
FOUR CORNERS ON THE PAGE

                    Corner position (in order of reading)
True position          1      2      3      4
1. Top left           .48    .25    .15    .12
2. Bottom left        .26    .44    .20    .10
3. Top right          .16    .19    .56
4. Bottom right       .18    .20    .33

As seen in Table 2, the greatest propor- 
tion of spatial recall was made in the top-
right corner (Position 3), and least recall
was observed in the lower-right corner (Po- 
sition 4) of the page. Of particular interest 
(see Table 2) is the pattern of the subject’s 
responses that were errors in spatial judg- 
ment; that is, it might be expected that 
while a subject failed to know the exact 
corner of the page, he nevertheless retained 
some knowledge about its position relative 
to the top or bottom or left or right columns 
of prose. Table 2 confirms this assumption. 
In general, subjects tended to be wrong on 
the same side of the page more often than 
top or bottom. An exception to this state- 
ment occurred for Position 3, but this find- 
ing reflected the overall tendency not to 
choose Position 4. The proportion correct 
spatial recall for top-bottom placement was 
.68 and .52, respectively; whereas, left-right
recall was .71 and .64 for sides of the page. 
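
The top-bottom and left-right proportions follow from collapsing the four-corner matrix of Table 2. The Python sketch below is a hedged illustration rather than the authors' procedure: it treats the four cells that are not cleanly legible in this copy as unknown, assumes every response fell in one of the four corners (so each row sums to 1), and weights the two true positions in each half of the page equally. All names are hypothetical.

```python
# Cells as read from Table 2; None marks a cell treated as unknown here.
# Rows are true positions, columns are checked positions, in order of reading:
# 0 = top left, 1 = bottom left, 2 = top right, 3 = bottom right.
TABLE_2 = [
    [0.48, None, 0.15, 0.12],
    [0.26, 0.44, 0.20, None],
    [0.16, 0.19, 0.56, None],
    [0.18, 0.20, 0.33, None],
]


def fill_missing(rows):
    """Fill a single unknown cell per row, assuming each row sums to 1."""
    filled = []
    for row in rows:
        missing = [j for j, value in enumerate(row) if value is None]
        if len(missing) > 1:
            raise ValueError("row sum cannot recover more than one cell")
        new_row = list(row)
        if missing:
            known = sum(value for value in row if value is not None)
            new_row[missing[0]] = round(1.0 - known, 2)
        filled.append(new_row)
    return filled


def agreement(rows, group):
    """Mean proportion of checks inside `group` for items truly in `group`."""
    return sum(sum(rows[i][j] for j in group) for i in group) / len(group)


table = fill_missing(TABLE_2)
TOP, BOTTOM, LEFT, RIGHT = (0, 2), (1, 3), (0, 1), (2, 3)

exact = sum(table[i][i] for i in range(4)) / 4
top, bottom = agreement(table, TOP), agreement(table, BOTTOM)
left, right = agreement(table, LEFT), agreement(table, RIGHT)

print(f"exact corner: {exact:.3f}")
print(f"top: {top:.3f}  bottom: {bottom:.3f}  left: {left:.3f}  right: {right:.3f}")
```

Under these assumptions the collapsed proportions agree, within rounding, with the reported .68 and .52 (top-bottom) and .71 and .64 (left-right), and the mean of the diagonal agrees with the overall proportion of .44.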

Confidence. Confidence ratings of the fill- 
in answers were analyzed for right and 
wrong responses for both presence and ab- 
sence of spatial knowledge. Table 3 shows 
the mean confidence ratings for each of 
these four categories of responses, as well as 
the overall averages. An inspection of these 
ratings suggests that confidence was slightly 
greater for fill-in responses accompanied by 
spatial recall. An analysis of variance for 
individual averages demonstrated that con- 
fidence was statistically different for right 
and wrong items (F = 138.64, df = 1/37, p 
< .001); however, confidence in item re-
sponses was not significantly affected by
spatial retention (F < 1). The interaction
of item and spatial responses also was not
significant (F < 1). Apparently, spatial re-
call did not mediate confidence in informa- 
tion recall. 

Reading time. Average reading time for 
all subjects was 23.09 minutes (SD = 5.02). 
Both item and spatial recall were examined 
by comparing results of the 20 fastest and 
slowest readers. Neither variable was sig- 
nificantly affected by reading time (p > 
.05). A similar breakdown for test times
produced the same nonsignificance. Under
TABLE 3
MEAN CONFIDENCE RATINGS FOR EACH OF FOUR
CLASSES OF RECALL

                                  Correct item               Incorrect item
                                 Spatial recall              Spatial recall
Measure                       Yes     No    Overall       Yes     No    Overall
Mean of individual averages   4.16   3.94    4.05         2.46   2.43    2.44
Overall average               4.18   4.06    4.12         2.58   2.53    2.55

self-paced conditions, differences in study 
time appeared to be relatively inconsequen- 
tial in affecting spatial retention. 


EXPERIMENT II


The results of Experiment I clearly indi- 
cated the reliability of a spatial attribute 
for information recall. However, the as- 
sumption that spatial memory would medi- 
ate confidence in item recall failed to be 
confirmed. Experiment II was undertaken 
with the purpose of investigating further 
the functional significance of spatial knowl- 
edge for verbal retention of prose material.
Also, several changes were introduced to 
provide additional validity for the previous 
findings. Experiment II repeated the same 
general procedure of the first experiment 
with the following changes: (a) a new pas- 
sage, slightly shorter, was tested; (b) a dif- 
ferent population of subjects was sampled; 
(c) a multiple-choice test was administered 
after item and spatial recall; and (d) the 
confidence scale was extended to 7 points. 

One further manipulation was introduced. 
Half the subjects in Experiment II were 
provided correct spatial information during 
item recall and multiple-choice discrimina- 
tion. It was assumed that the presence of 
spatial knowledge would afford maximum 
opportunity for its use as a cue for reten-
tion.


Method 


Materials. Booklets were prepared in the same 
manner as in the previous experiment, although a 
different section of Boring’s History (1950, pp. 
517-524) was selected. In the opinion of the experi- 
menters this represented a somewhat easier pas-
sage to read and comprehend. Length of the pas-
sage was approximately 3,200 words, presented in
21 blocks of text and on five pages. The same study 
and test conditions as in Experiment I were used, 
but only 16 questions were asked from the middle 
of the passage. Confidence ratings were made on a
7-point scale. One-half of the experimental book-
lets were prepared with the correct corner of the
page indicated for each question. Also, multiple- 
choice questions with four alternatives were 
prepared, keeping the same stem as the fill-in ques- 
tions. Three distractors were chosen that appeared 
in the passage with approximately equal frequency 
as the correct answer and represented varying 
points of occurrence in the text. Only one order of 
the multiple-choice questions was used in booklet
form.

Procedure. All subjects were given written study 
instructions. After completing the passage the sub- 
jects were provided test booklets with the fill-in 
questions and either one of two conditions of test- 
ing. In one condition (C), test instructions in- 
formed the subject that information as to the 
location on the page of the correct answer was 
provided, and the subject was to try and use this 
information to aid recall. The other condition (N) 
was the same as in Experiment I. Upon completion 
of the self-paced recall, both groups were given 
multiple-choice questions in a new test order. 
Condition C received a booklet with the spatial 
format as it appeared in item recall. Condition N 
received only the 16 multiple-choice questions. The 
subjects were instructed to answer all multiple- 
choice items even if they had successfully com- 
pleted the fill-in questions. 

Subjects. All of the subjects were drawn from
the introductory psychology class and participated 
as part of the course requirement. A total of 64 
subjects were randomly assigned to either Group 
C or N with an equal number of study-test condi- 
tions distributed randomly within each group. All 
subjects were tested in small groups. 


Results 


Recall. Number of correct fill-in responses 
was compared in a 2 × 4 × 2 analysis of
variance where cueing, study, and test con-
ditions represented the factors of interest. 
All obtained Fs were less than 1, demon- 
strating the equivalence of all these fac- 
tors in affecting recall. The failure to find 
any difference in overall recall between 
Groups N and C suggests that spatial infor- 
mation is irrelevant for correct item recall 
when compared to a group that must pro- 
vide its own spatial cue. Mean recall was 
5.50 and 5.94 for Conditions N and C, re- 
spectively. The overall low level of recall 
indicates that the passage and test were dif-
ficult tasks for the subjects. Range in cor-
rect fill-in answers was 1-12.

Retention of spatial information for Con- 
dition N was reliable (t = 6.41, df = 31, p
< .001; chance = 4.00). An additional pro- 
portion test revealed a z of 9.69. Therefore, 
the validity of the spatial phenomenon was 
extended with a different passage and popu- 
lation. Average spatial recall was 6.97, or 
an overall proportion of .44 when all possi- 
ble instances were recorded. Interestingly, 
the absolute level of spatial recall was 
nearly identical to that obtained in Experi- 
ment I. 
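
The Experiment II figures for Condition N can be cross-checked against one another in the same spirit. The short Python sketch below is an illustration under stated assumptions, not a reanalysis: the total of 223 correct quadrant choices is inferred by rounding 6.97 × 32, and the standard deviation is back-computed from the reported t because the text does not report it. Variable names are hypothetical.

```python
import math

# Reported values for Condition N: 32 subjects, 16 questions, 4 quadrants.
N_SUBJECTS, N_QUESTIONS, P_CHANCE = 32, 16, 0.25
MEAN_CORRECT, T_REPORTED = 6.97, 6.41

chance_mean = N_QUESTIONS * P_CHANCE   # 4.00, as stated in the text
trials = N_SUBJECTS * N_QUESTIONS      # all possible instances

# Assumption: total correct quadrant choices = mean x subjects, rounded.
total_correct = round(MEAN_CORRECT * N_SUBJECTS)
p_hat = total_correct / trials

# Normal-approximation z for the overall proportion against chance.
standard_error = math.sqrt(P_CHANCE * (1 - P_CHANCE) / trials)
z = (p_hat - P_CHANCE) / standard_error

# Standard deviation implied by the reported t (not reported in the text).
implied_sd = (MEAN_CORRECT - chance_mean) * math.sqrt(N_SUBJECTS) / T_REPORTED

print(f"proportion correct = {p_hat:.2f}, z = {z:.2f}, implied SD = {implied_sd:.2f}")
```

The proportion and z obtained this way agree closely with the reported .44 and 9.69; the implied standard deviation is only a derived figure and should be read with that caveat.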

When the proportion of correct spatial re- 
call for right items was compared to the 
same proportion for wrong responses, the 
differences were statistically significant (t 
= 5.74, df = 31, p < .01). These results 
agree with those of the previous experiment 
in showing a greater likelihood of spatial 
information accompanying correct item re-
call. Both proportions were significantly
greater than that which would be expected 
on a chance basis (t = 7.00 for right items; 
t = 4.19 for wrong recall). 

The 16 fill-in questions were rank ordered 
on frequency of spatial and item recall. The 
Spearman r between rankings was .32 (ns),
and an inspection of the ranks revealed no
consistent advantage for position recall as a
function of difficulty of the fill-in item. 

An analysis of item and spatial recall for 
the four corners of the page was made in a
manner similar to that reported in Experi- 
ment I. In order of reading, the proportion 
of item recall was .26, .28, .20, and .26, for 
the four page blocks, respectively. The re- 
sults for spatial recall are tabulated in 
Table 4. These proportions are remarkably 
congruent with those obtained for the same 
comparison in Experiment I. Again, sub- 
jects performed worst on the lower-right
quadrant, and they tended to make rela-
tively more errors on the same side as the
correct answer. When spatial retention for
Condition N was compared for left-right
and top-bottom orientation in the passage,
these proportions were .73-.70 and .66-.52,
respectively, for these two distinctions.

Confidence. Table 5 reports the individ-
ual mean confidence ratings for the various
TABLE 4
PROPORTION OF SPATIAL RECALL FOR EACH OF
FOUR CORNERS ON THE PAGE

                    Corner position (in order of reading)
True position          1      2      3      4
1. Top left           .50    .24    .19    .07
2. Bottom left        .25           .17    .11
3. Top right          .14    .16    .50
4. Bottom right       .12


classes of recall observed for Condition N. 
While the overall average suggests a slight 
increase in confidence as a function of spa- 
tial knowledge, the mean individual aver- 
ages do not support this trend. Due to the 
low overall level of recall for fill-in answers, 
and the fact that a number of subjects did 
not contribute equally to the averages 
based on individual means, the overall av- 
erage represents a more reliable index of 
confidence. An analysis of variance was 
performed on the 21 subjects who contrib- 
uted scores to each of the four classes of 
recall for Condition N. No effects reached 
significance other than average confidence 
associated with right and wrong recall (F = 
52.06). 

Mean individual confidence ratings for 
subjects in Condition C were 5.79 and 3.26 
for right and wrong recall, respectively. 
When these results were compared to the 
same averages in Condition N, irrespective 
of spatial retention, there was no significant 
difference in confidence due to the presence 
of spatial cues in Condition C (F < 1). 

TABLE 5
MEAN CONFIDENCE RATINGS FOR EACH OF FOUR
CLASSES OF RECALL FOR CONDITION N

                              Correct item      Incorrect item
Measure
Mean of individual averages   5.
Overall average               5.

Multiple choice. Two comparisons con-
cerning multiple-choice discrimination re-
sults are of major interest. Overall correct
multiple-choice performance was first com- 
pared between Conditions N and C. The 
average multiple-choice correct for these 
conditions was 10.09 and 10.72, respectively 
(t = 1.03, df = 62, p > .05). Apparently, 
the presence of a spatial cue did not lead to 
finer discrimination on a subsequent multi- 
ple-choice test. To provide additional evi- 
dence on this point, mean proportion multi- 
ple-choice performance on wrong items for 
Condition C was compared with multiple- 
choice performance in Condition N for 
those wrong items that were not accompa- 
nied by correct spatial recall. Similar non- 
significant results were obtained (t = 1.16). 

Second, within Condition N, the mean 
proportion of correct multiple-choice in- 
stances for wrong items with and without 
spatial recall was .47 and .45, respectively. 
A dependent t was .17 for those means.
Knowledge of item location apparently did 
not facilitate later multiple-choice discrimi- 
nation when only wrong items were con- 
sidered. Finally, as in Experiment I, no dif- 
ferences were detected in spatial retention 
when subjects were classified according to 
reading time. Average reading time for all 
subjects in Condition N was 17.01 minutes. 


Discussion 


Results of both experiments have demon- 
strated the reliability of a spatial attribute 
of memory for verbal recall of text mate- 
rial. When subjects were asked to make 
precise judgments of item location from ho- 
mogeneous text, they did so significantly 
better than would be expected by chance. 
However, the absolute level of spatial re- 
tention was generally low, suggesting that 
spatial knowledge is not a dominant attri-
bute when recall is tested for prose material.
To what extent level of spatial recall is de- 
pendent upon the particular task and mate- 
rial used is not clear. However, the ease or 
difficulty of specific test questions was not 
related to spatial retention. 

Further confirmation of subjects’ ability 
to remember item location in text material 
has been provided in a recent study carried 
out independently by Rothkopf (1970). In 
Rothkopf’s experiment, subjects were asked 
to indicate the location of information, both
within a 12-page passage (3,000 words) and
within a particular page. Within-page recall
was measured by having subjects check the
specific eighth of the page in which an an-
swer was to be found. Distribution of spa- 
tial judgments for both types of recall was 
significantly different from chance, predom- 
inantly in the direction of true spatial loca- 
tion. 

When subjects were considered, in both
the Rothkopf (1970) study and the present
case, spatial retention was more likely to 
accompany correct item recall than when 
information retrieval was not achieved. 
That the latter situation is more likely to 
be reported is probably due to the selective 
retention of subjects for these bothersome 
memory experiences. However, in the pres- 
ent experiments, the proportion of spatial 
recall accompanying both right and wrong 
items was significantly greater than chance. 

Nevertheless, the fact that probability of 
spatial recall is substantially increased 
when item recall is successful presents an 
additional problem of interpretation. Ob- 
viously, spatial recall is not a necessary 
correlate of item retrieval. Over two-thirds 
of the correct fill-in responses (allowing for 
chance) were not accompanied by the 
knowledge of the item’s location on the 
page. That spatial retention contributes to 
correct item recall follows from the nature 
of a discriminative cue posited by the at- 
tribute theory. However, the finding that 
spatial recall is greater for right than for 
wrong items is correlative evidence only. A 
possible explanation for the difference in 
spatial recall obtained for right and wrong 
fill-in answers is the potential cue value of 
the correct item for spatial recall, a sugges- 
tion also made by Rothkopf. However, it is 
certainly possible that a spatial attribute is 
a correlate of other attributes (e.g., con- 
text) that are part of correct information 
recall.

Confidence ratings of fill-in responses 
were not differentially affected by the pres- 
ence or absence of spatial knowledge. Given 
the overall low level of spatial recall, and 
the fact that any determination of confi- 
dence included choices correct by chance, 
differences in confidence perhaps should not
be expected. On the other hand, it is possible
that both retrieval and confidence of
verbal responses are mediated by more 
dominant attributes, specifically associative 
or semantic characteristics. A spatial at- 
tribute may suggest a retrieval cue when 
these more dominant attributes fail to re- 
cover an item from memory. Therefore, re- 
call of place on the page suggests a shift in 
the subject’s attribute hierarchy to a less 
dominant attribute. 

The secondary aspect of spatial recall for 
item retention is further suggested by the 
results of Experiment II. When spatial 
knowledge was provided to the subjects for 
both fill-in and multiple-choice items, nei- 
ther retention measure was affected by cue 
availability. The present procedure for the 
noncued condition (N) of Experiment II 
was similar to that used by Hart (1965) to 
investigate the feeling-of-knowing experi- 
ence. In Hart’s experiment, subjects were 
asked first to attempt fill-in answers and 
then to indicate their feeling of knowing for 
the correct answer for wrong items. These 
judgments were later related to correct mul- 
tiple-choice discrimination, with subjects 
performing significantly better on items for 
which a feeling of knowing was indicated. 

In Experiment II of the present study, 
subjects were asked to judge spatial location 
of fill-in answers before a subsequent recog- 
nition test. When multiple-choice discrimi- 
nation was examined for incorrect fill-in 
answers, no advantage was obtained for 
items whose location was known. This was 
true whether the experimenter provided 
the spatial cue or the subject generated 
spatial knowledge. Given the present task 
and material, spatial recall as a discrim- 
inative cue in multiple-choice perform- 
ance was not evidently facilitating. A feel- 
ing of knowing where an event was experi- 
enced is apparently not tantamount with a 
feeling-of-knowing experience based on pre- 
dicted recognition of that event. 

Finally, two problems remain unresolved. 
In both experiments it was quite apparent 
that spatial recall was affected by the corner
of the page that was tested; that is,
subjects tended to make relatively fewer 
correct spatial choices for items located in 
the lower-right corner of the page. It is not
obvious which conditions contribute to this
bias since item recall was unaffected by
page position. Second, the particular mechanism
that serves to enable spatial recall is
not clear. Assuming that item retrieval is 
primarily associatively mediated, it is pos- 
sible that imagery of the text page provides 
a secondary attribute that carries spatial 
information.

GRADE EXPECTATIONS, DIFFERENTIAL TEACHER COMMENTS,
AND STUDENT PERFORMANCE

This study investigated the effects of differential, written teacher
comments on student performance, and the relationship of control of 
reinforcement orientation with the aforementioned variables. Exam 
papers of 87 CCNY undergraduates were assigned to one of three treat- 
ment groups: no comment (NC), specified comment (SP), or specified 
comment/grade expectation considered (SPGE). The Rotter I-E Scale 
was administered to the subjects 1 week later. Treatment effects were 
judged by performance on the next exam. Results indicated (a) that 
SP and SPGE subjects performed better than NC subjects (p < .05); 
(b) that SPGE subjects performed better than SP subjects (p < .005); 
and (c) a positive correlation between I-E scores and SPGE subjects’ 
performance on the second exam (p < .01). A 2 × 2 analysis of variance
demonstrated an interaction effect (p < .01). The educational implications
discussed emphasized the efficacy of adapting feedback to the
individual student.


Grading is one of the most extensively 
applied methods in use in the field of educa- 
tion. Teachers ask students to perform 
tasks for grades, and all teachers spend 
some of their time grading examination pa- 
pers. One of the reasons this is done is so 
the teacher can evaluate the success of his 
instruction and the student can evaluate 
the success of his learning. It seems, how- 
ever, that the feedback of information may 
not be the sole purpose for this assessment 
process. The motivational state of the stu- 
dent may also be affected. The simplest mo- 
tivational element is avoidance of lack of 
interest that might result from no knowl- 
edge of results, but other elements are also 
present. A good mark may be considered a 
reward, whereas a poor mark may be consid- 
ered a punishment. Marks may be accom- 
panied by statements or comments so that 
together or singly they constitute a situa- 
tion of praise or blame. Some teachers con- 
sequently spend additional time writing 
comments on examination papers with the 
hope that increased subsequent performance
will result. Whatever specific system a
teacher utilizes, in terms of grading and
feedback, that system is based on his own
judgment, which may be neither valid nor
reliable.

The effects of teacher comments investi- 
gated by Page (1958), Tyler (1958), Sas- 
senwrath and Garverick (1965), and 
Pickup and Anthony (1968) indicated the 
possible efficacy of teacher comments as a
worthwhile instructional practice. The 
question as to what type of teacher com- 
ments are most effective certainly is not 
very clear. 

It seems logical to assume that when a 
student has completed a task for which he 
is to receive a grade, he forms an expecta- 
tion of some kind regarding that grade. Re- 
gardless of what procedure the teacher ad- 
heres to, it is unlikely that he considers the 
student’s subjective evaluation of his own 
performance (grade expectation) in meting 
out feedback. Furthermore, it is unlikely 
that the teacher considers the existence and 
possible influence of individual differences 
in regard to how susceptible each student is
to be affected by teacher comments. The
paucity of literature giving any consideration
to individual difference variables in
this area is of concern to this investigator.
More specifically, it is suggested that a be- 
lief in internal or external control of rein- 
forcement is just such a variable. Review 
papers (Lefcourt, 1966; Rotter, 1966) have 
shown that the internal-external control of 
reinforcement construct is a variable that 
can be measured, and which is predictive of 
behavior in a variety of circumstances. 


The internal-external control of reinforcement
construct (I-E) is rooted in social
learning theory (Rotter, 1954). It is per- 
ceived to be a generalized expectancy, rele- 
vant to various situations, that relates to 
whether or not the individual feels he pos- 
sesses or lacks power to affect what happens 
to him. It is suggested that the I-E variable 
is of significance in understanding the ef- 
fects of teacher comments on student per- 
formance. 

This study tested the following two hy- 
potheses: 


1. Students who receive written teacher
comments that incorporate their grade 
expectations perform significantly better 
on a subsequent examination than stu- 
dents who receive written teacher com- 
ments without regard to their grade ex- 
pectations. 


2. Under teacher comment conditions, a 
significant positive correlation exists be- 
tween student performance and control 
of reinforcement orientation. 


METHOD


Eighty-seven undergraduate students who registered
for a physics course at CCNY were the subjects
of this investigation. First, the instructor administered
the objective exam that ordinarily came
next in his course of instruction. Before handing in 
their exam papers, subjects were told by their in- 
structor that one of the instructors in the School of 
Education was trying to find out how well students 
could predict their own grades on exams. Each
subject was then given a questionnaire which asked
him to predict the grade he expected to receive on
the exam. The instructor informed them that
participation was voluntary and would not affect
their grades on the exam.

The instructor collected both the tests and the 
questionnaires, and the latter were given to the 
experimenter. The instructor proceeded to mark 
the exams in his manner so that each paper exhibited
a numerical score, following his usual
policy of grade distribution. The papers were then
given to the experimenter who ranked them according
to numerical order within the class, with
the best paper on top. The first three subjects 
(papers) were assigned to one of the three treat- 
ment groups in a random fashion, then the second 
three subjects, etc. The three treatment groups
were: (a) NC (no comment): subjects
received only the grade given by the instructor on
their examination papers. (b) SP (specified comment):
subjects received comments designated
in advance for each letter grade in addition
to the grade given by the instructor (i.e., A,
excellent; B, good). (c) SPGE (specified comment/grade
expectation considered): subjects
received comments designated in advance for
that combination of grade expectation and grade
received appropriate for the individual case. For
example, a student who expected an A and
received a C also received the comment, “OK, but
I really expect that you can do much better,” while
a student who expected a C and received an A also
received the comment, “Excellent. I’m pleasantly
surprised. Keep it up.”
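
The assignment procedure described above (rank the papers, then distribute each successive set of three at random over the three treatments) can be sketched as follows. This is only an illustrative sketch: the function name, the seed, and the demonstration scores are hypothetical, since the article reports neither the individual scores nor the randomization device used.

```python
import random

TREATMENTS = ["NC", "SP", "SPGE"]

def assign_ranked_blocks(scores, seed=None):
    """Rank papers by score (best first), then randomly assign each
    consecutive block of three papers to the three treatments."""
    rng = random.Random(seed)
    # Paper ids ordered from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    assignment = {}
    for start in range(0, len(ranked), len(TREATMENTS)):
        block = ranked[start:start + len(TREATMENTS)]
        order = TREATMENTS[:]
        rng.shuffle(order)
        # zip stops at the shorter sequence, so a short final block is safe.
        for paper, treatment in zip(block, order):
            assignment[paper] = treatment
    return assignment

# Hypothetical scores for 87 papers (the real scores are not reported).
demo_rng = random.Random(0)
scores = {f"paper{i:02d}": demo_rng.randint(40, 100) for i in range(87)}
groups = assign_ranked_blocks(scores, seed=1)
counts = {t: sum(1 for g in groups.values() if g == t) for t in TREATMENTS}
print(counts)
```

Because 87 papers form exactly 29 sets of three, each treatment receives 29 papers, and the three groups are matched on first-exam rank.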

The effects of the three treatments were judged 
by the scores achieved on the next exam, given 
in the normal course of study. The Rotter I-E 
Scale (1966) was administered to subjects 1 week 
later. 


RESULTS 


Examination of Table 1 indicates that a 
difference existed between the SP group 
(M = 70.4, SD = 17.0) and the SPGE 
group (M = 78.3, SD = 14.8) with respect 
to scores on the subsequent exam. This dif- 
ference was found to be significant (p < 
.005), thus supporting the first hypothesis. 
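
As a check on the arithmetic, the t ratio in Table 1 can be recovered from the two group means. The sketch below assumes that the 2.66 entry in Table 1 is the standard error of the difference between means; the variable names are illustrative only.

```python
# Values reported in the text and in Table 1.
mean_sp, mean_spge = 70.4, 78.3
se_difference = 2.66  # assumed: standard error of the difference (Table 1)

mean_difference = mean_spge - mean_sp
t_ratio = mean_difference / se_difference
print(f"difference = {mean_difference:.2f}, t = {t_ratio:.2f}")
```

The ratio agrees with the tabled value of 2.97, which suggests that the table entries are internally consistent.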

The correlation coefficients demonstrating 
the relationship between the treatment 
groups and I-E Scale scores were computed 
for each of the groups utilizing the subsequent
test scores. The correlations are presented
in Table 2.


TABLE 1

COMPARISON BETWEEN TEST PERFORMANCE OF
SPECIFIED COMMENT (SP) AND SPECIFIED
COMMENT/GRADE EXPECTATION
CONSIDERED (SPGE) GROUPS

Comparison          | Difference between means | SE of difference | t    | p
Between SP and SPGE | 7.90                     | 2.66             | 2.97 | <.005


Note.—Probability quoted assumes that one- 
tailed test was appropriate.

TABLE 2

CORRELATIONS BETWEEN SECOND-TEST
PERFORMANCE AND I-E SCALE SCORES

Scale             | Group NC | Group SP | Group SPGE
Internal-External | −.32*    | +.23     | [illegible]


Note.—Abbreviations: NC = no comment, 
SP = specified comment; SPGE = specified com- 
ment/grade expectation considered. N = 29 in 


Examination of Table 2 indicates that 
two of the three treatment groups are sig- 
nificantly correlated with the control of re- 
inforcement variable. Examination of the 
data shows that NC subjects who received 
high scores on the I-E Scale (external-con- 
trol subjects) tended to perform poorer on 
the subsequent exam than subjects in the 
same treatment group who received low 
scores on the I-E Scale (internal-control 
subjects). The significant positive correla- 
tion (p < .01) between external control of 
reinforcement and subjects in the SPGE 
group seems to support the second hypothe- 
sis. The correlation, however, between the 
SP group and control of reinforcement was 
not significant. 
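
The significance of the NC and SP correlations can be checked by converting each r to a t statistic with N − 2 degrees of freedom. The sketch below rests on two assumptions not stated explicitly in the text: that each group contained 29 subjects (87 subjects in three equal groups), and that a one-tailed criterion was used, as in the note to Table 1. The critical value is the standard tabled t for 27 df.

```python
import math

def t_from_r(r, n):
    """Convert a Pearson r based on n pairs into a t statistic (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

N = 29                        # assumed group size (87 subjects / 3 groups)
T_CRIT_05_ONE_TAILED = 1.703  # tabled t, df = 27, one-tailed alpha = .05

for group, r in [("NC", -0.32), ("SP", 0.23)]:
    t = t_from_r(r, N)
    verdict = "significant" if abs(t) > T_CRIT_05_ONE_TAILED else "not significant"
    print(f"{group}: r = {r:+.2f}, t(27) = {t:.2f}, {verdict}")
```

Under these assumptions the NC correlation exceeds the one-tailed .05 criterion and the SP correlation does not, in agreement with the text.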

A 2 X 2 analysis of variance was per- 
formed to clarify and expand the meaning 
of the significant correlations on the second 
test between I-E Scale scores and the NC 
and SPGE treatment groups. Differences 
attributable to main effects (control orien- 
tation; treatments: NC, SPGE) were not 
significant. A significant interaction effect 
was manifest (F = 15.48, p < .01). Scheffé 
comparisons indicated that (a) external 
SPGE subjects performed better than internal
SPGE subjects (p < .01); (b) internal
NC subjects performed better than external
NC subjects (p < .05); (c) external SPGE
subjects performed better than external NC
subjects (p < .01); and (d) no significant
difference was manifest between the inter- 
nal NC subjects and the internal SPGE 
subjects.

DISCUSSION


The results of this study become more 
intriguing when compared with the findings 
of Page (1958) who found a significant difference
between his no comment and specified
comment groups (p < .05). A correlated
t test indicated that no significant difference
existed in the present study between
NC and SP group subjects. One possible ex- 
planation for this difference in findings may 
be the length of the specified comments.
Page utilized longer comments than the 
present study for all specified comment cat- 
egories. Longer comments, albeit hackneyed 
labels, might indicate to the student greater 
concern on the part of the instructor than 
similar comments that are shorter. Another 
factor which should be recognized is the age 
difference in the subject populations of the 
two studies. The subjects of the Page study 
were in secondary school, while the present 
investigation employed college students.
Perhaps the greater exposure of the latter
group to standard comments negatively influenced
the possible motivational effects of
such comments.

Page failed to find any significant differ- 
ence between his two comment groups, 
whereas in the present investigation a sig- 
nificant difference was manifest between the 
two comment conditions. Since the specified 
comment group was employed in both studies,
an investigation of the free comment
and specified comment/grade expectation 
groups appears to be in order. In the Page 
study, teachers were instructed to write 
anything that occurred to them in the circumstances.
There was not any right or
wrong comment for the study. A comment 
was right if it conformed with the teacher's 
own feelings and practices. It is not surpris- 
ing that this carte blanche treatment did 
not differ significantly from the specified 
comment group. Indeed, it is possible that 
some subjects in this group received either 
no comment or a standard comment (i.e.,
specified comment) while others might have
received extensive informational as well as
affective remarks. At any rate, the lack of 
structured content would seem to render the 
meaningful replication of this treatment
group impossible. The present study utilized
the concept of individualized comments, in- 
herent in the free comment group, but 
structured and limited the content of these 
comments. Apparently the incorporation of 
students' grade expectations into comments, 
specified for each performance and expecta- 
tion level, was a significantly more effective 
technique. 

A most important finding of this investi- 
gation is the correlation between control of 
reinforcement orientation and Treatment
Groups NC and SPGE. Internal students
who did not receive comments tended to
score higher on the second examination
than external students in the same treat- 
ment group. It seems logical that external 
students would not achieve as well as inter- 
nal students when the feedback they re- 
ceived (a percentage score without any 
comment) did not in any way cause them 
to reflect upon their external control orien- 
tation. What is being suggested here is that 
teacher comments can have a motivating 
effect, not only because they are a form of 
positive reinforcement, but also because 
they present an indirect challenge to the 
student maintaining an external control ori- 
entation. When an external student reads 
the comment, “I expect that you can do 
even better” for example, it seems alto- 
gether likely that he realizes that this situa- 
tion is one in which his behavior (i.e., stud- 
ying more) can make a difference. Whether 
or not he is consciously aware of this process 
or is willing to acknowledge it is unimpor- 
tant. Obviously, maintaining an external 
orientation allows one to defensively ac- 
count for poor performance while yet main- 
taining self-esteem. This form of rationali- 
zation may actually be an effective modus 
operandi enabling students to continue to
strive academically, meet with failure, and 
still maintain a positive self-regard. This 
explanation can account for the correlation 
between I-E and the NC group as well as 
the even greater correlation between I-E 
and the SPGE group. Furthermore, it is al- 
together possible that students in the NC 
group were aware that other students in the 
class did receive various comments on their 
test papers. Assuming this to be true, the
significant negative correlation between I-E
and the NC group is readily understandable.

It was expected that there also would be
a significant positive correlation between
externality and the SP group. The lack of 
any significant correlation can best be ex- 
plained by the lack of affective content and 
overusage of remarks such as excellent or 
good. By the time a student reaches college, 
he has probably seen these kinds of imper- 
sonal comments for many years and has be- 
come desensitized to their effects. The ques- 
tion of why the internal students do not 
maintain their superiority as was manifest 
under the NC condition seems relevant 
here. Internals may not only be desensitized 
by these hackneyed labels, but may also 
feel that they are being deprived of some of 
their control of the environment; that is, 
their own ability to label their objective 
grades. 

Implications for Education 

The nature of feedback provided in the 
learning process has been known to vary 
widely. The present investigation has dem- 
onstrated the value of written teacher com- 
ments that incorporate students’ grade ex- 
pectations. Furthermore, it has shown that 
students who maintain an external control 
of reinforcement orientation are more apt to 
be influenced by these comments. 

The comments involved in the SPGE 
condition incorporated the grade expecta- 
tion of the student, and involved empathic 
communication. For example, a student who 
expected a C and received an A also re- 
ceived the comment “Excellent. I’m pleas- 
antly surprised. Keep it up.” This comment 
acknowledges that he will have an affective 
reaction to the grade and attempts to com- 
municate both understanding and accept- 
ance of that reaction. The fact that it in- 
cludes a motivational aspect should also be 
recognized. 

Teachers clearly have been told by edu- 
cators that students learn at different rates 
in accordance with individual differences. 
How they are to acknowledge these individ- 
ual differences in their classroom teaching 
behavior, however, has not been altogether
clear. Some teachers attempt to adapt feedback
to the individual student on an intuitive
basis, utilizing past experiences and
impressions of the student. It should be rec- 
ognized that more valid adaptations are 
possible. This study would seem to suggest 
that teachers can take account of the pre- 
dispositions of the learner, in terms of both 
his I-E orientation and his grade expecta- 
tion while rendering feedback, by simply 
writing the appropriate comments on his 
examination paper. That this has an effect 
on future performance has been demon- 
strated in this investigation. 

Teachers who render feedback in the 
form of written comments, solely on the 
basis of past performance, are ignoring the 
existence of grade expectation and personal 
style (i.e., control of reinforcement orienta- 
tion). It is recognized by the present author 
that the scope of this study can be broad- 
ened greatly. If the incorporation of grade 
expectations into teacher comments can 
have a significant effect on the performance 
of students, one can only speculate as to the 
possible effect of comments that incorporate 
other personality variables as well as situational
variables. What is being suggested is
not only an individualized approach to 
feedback but a personalized one as well.

To determine the effect of student characteristics and student control
on learning, three experimental variables (college aptitude, inquisitive- 
ness, and student control) were combined in a 2 X 2 X 4 factorial 
arrangement. Videotape recording facilities were used to simulate a 
learner-computer environment, in which 192 college students were 
given degrees of control over the programming of their own learning. 
As expected, high-aptitude-high-inquiry subjects learned significantly 
more under a high degree of student control, and high-aptitude-low-inquiry
subjects learned significantly more under a low degree of
student control. Results for low-aptitude subjects were inconclusive. 
Overall, subjects learning under a high degree of student control 
learned the least. However, they formed the most favorable attitude
toward the method of instruction.


Recently, psychologists and educators 
have become advocates of methods of in- 
struction in which the learner is given far 
greater self-direction or control over his 
own learning. However, there is only lim- 
ited and questionable evidence that self- 
direction is a motivating, satisfying, or an 
effective mode of interacting within a learn- 
ing environment. Furthermore, other evi- 
dence suggests that any single mode of in- 
struction will interact with specific learner 
characteristics (Leith, 1970; Majer, 1969; 
Tallmadge & Shearer, 1969, 1971). Thus, it 
seems reasonable that at least some student
"types" will learn less and find less satisfaction
under a single specific instructional
mode, be it student controlled or instructor
controlled. In fact, a few current learning
systems operate under the assumption that
each student should be provided with the
mode of instruction which is best suited or
"adaptive" to his cognitive style, aptitude,
interests, personality characteristics, etc.
(Gallagher, 1970; Rigney & Towne, 1970; 
Taylor, Montague, & Hauke, 1970).

1 This study was based on a doctoral dissertation
submitted to the graduate school of Michigan
State University. The author is indebted to
Robert H. Davis and Lawrence T. Alexander for
their advice and guidance.

2 Requests for reprints should be sent to John P.
Fry, who is now at HumRRO, P. O. Box 6057, Ft.
Bliss, Texas 79916.


Of the few studies dealing with student- 
controlled instruction, Mager (1961), 
Mager and McCann (1961), Mager and 
Clark (1963), and Grubb and Selfridge 
(1964) have produced striking results, and 
claim that student control increases learn- 
ing effectiveness. However, they either 
failed to use control groups or they included 
other variables that may have accounted 
for the observed differences. On the other 
hand, of the two investigators who have 
done controlled experimentation in this 
field, Gallagher (1970) found no significant 
differences and Campbell (1964) found re- 
sults that were equivocal. 

The present research sought to provide 
the learner with a learning environment 
that was less restrictive and more respon- 
sive to learner control than that formulated 
by Campbell or Gallagher; yet not as per- 
missive as the environment introduced by 
Mager (1961), which precluded experimen- 
tal control. 

In preparing a review of research on the 
sequencing of instruction, Briggs (1968) 
found it convenient to classify research on a 
learner/instructor control continuum. At 
one end of such a continuum is the situation 
where the instructor arbitrarily lectures
from an arbitrarily prepared text, insensitive
to the learner’s presence. At the other
extreme, it is possible for the learner to relate
to the instructor and material on an
individual basis because he can demand
from the instructor, at will, any information,
examples, reviews, or feedback in an
order convenient to his personal learning 
needs. 

An attempt was made not only to simu- 
late these two extreme instructional envi- 
ronments but also to distinguish between 
learners who should thrive in each. In other 
words, it was hypothesized that whenever a 
match between the source of instructional 
control and student learning style occurred, 
superior learning and satisfaction would re- 
sult. Likewise, a mismatch should result in 
inferior performance and dissatisfaction 
with the learning environment. 


METHOD 
Subjects 


The subjects were 192 volunteers from intro- 
ductory psychology courses taught at Michigan 
State University during the fall term, 1969, and 
winter term, 1970. Most of the subjects were 
freshmen (N = 157) and female (N = 119).


Design 


The three-factor experimental design included 
four levels of instructional treatment, two “in- 
quiry” levels, and two aptitude levels. The four 
instructional treatments were:

Student-controlled instruction (SCI) treatment. 
Under this treatment, each subject was given a 
freshly shuffled deck of cards. Each card in the 
deck contained one significant question about 
computers to which the subject might wish to 
know the answer. For each card there was a cor- 
responding videotape segment, that, in effect, 
answered the question on that card. Therefore, 
each subject could decide the sequence in which 
he wanted the questions answered. The experi- 
menter, in another room, located the appropriate 
videotape segments and played them to the subject
over a television monitor. Subjects could repeat
and/or omit up to 4 of the total of 52
segments.

In addition, these subjects were given the oppor- 
tunity to ask questions about each tape segment 
after it had been shown. If asked, the experimenter 
would appear on the monitor and answer each 
inquiry posed to him. All questions and answers 
were recorded on videotape for later use with sub- 
ject groups described below. 

Finally, subjects in this group, as well as those 
below, were allowed to proceed at their own pace 
and to take as many notes as needed. 

Expert treatment. These subjects viewed the 
videotape segments in a sequence predetermined
by a consensus of six computer-science instructors.
The sequence to be followed was presented to 
these subjects as a list of questions, identical to 
those on the SCI cards above. Furthermore, each 
subject in this group was yoked with one of the 
subjects under the SCI treatment. Therefore, any 
repetition or deletion of segments by the SCI sub- 
ject was duplicated for his counterpart subject in 
this group. Subjects under the expert treatment 
were not allowed to ask questions and had no 
recourse but to view the recorded questions (and 
answers) asked by their SCI counterparts. These
recordings followed directly after the appropriate 
tape segment had been displayed. 

Random treatment. These subjects viewed the 
videotape segments in a completely random 
sequence. The sequence to be followed was presented
as a list of questions, as above. As with
expert subjects, each subject in this group was 
yoked with one of the SCI subjects and heard only 
his questions. 

Control treatment. These subjects received no 
instruction or information with regard to com- 
puters. 

Subject-types. It was hypothesized that only 
certain types of subjects would be able to take full 
advantage of a student-controlled learning en- 
vironment. Therefore, all subjects were initially 
separated into groups of subject-types who were,
respectively, high on both inquiry and aptitude 
predictor variables, low on both of these dimen- 
sions, and two groups for which the dimensions 
were incongruent. 

The inquiry dimension was derived from a 
battery of six tests which Shulman, Loupe, and 
Piper (1968) found to account for 50% of the vari- 
ance between effective and ineffective inquirers. 
Scores on the six tests were standardized and 
summed. High total scores reflected individuals
high in cognitive complexity, preferring the ambiguous,
the asymmetrical, and unexpected to the
regular, articulated, and predictable; liberal in
political values; high in associational fluency; high
in nonstereopathy; high in verbal problem solving;
and low in expressed test anxiety. 
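
The composite inquiry score described above (standardize each test, then sum across tests) can be sketched as follows. The data are hypothetical, and the sketch assumes that every test has already been keyed so that a higher score indicates more of the inquiry characteristic (for example, that test anxiety has been reverse-scored); the article does not report these details.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw scores to z scores (mean 0, SD 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def composite_scores(test_columns):
    """Sum each subject's z scores across all tests."""
    z_columns = [standardize(col) for col in test_columns]
    return [sum(zs) for zs in zip(*z_columns)]

# Hypothetical raw scores: 2 tests (rows) x 4 subjects (columns).
raw = [
    [10, 12, 14, 16],
    [50, 40, 60, 70],
]
totals = composite_scores(raw)
print([round(t, 2) for t in totals])
```

Standardizing first gives each test equal weight in the sum regardless of its raw-score scale.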

The aptitude dimension was derived from 
standardized scores on college aptitude tests (ACT, 
CQT, or SAT) routinely administered to all enter- 
ing students. This distribution of scores was 
divided at the median as was the distribution of 
inquiry scores for the same subjects.

From each cell of the resultant four-cell matrix, 
sets of four subjects each were randomly drawn. 
Then each subject of each set was assigned ran- 
domly to one of the four instructional treatments. 
Eventually, 12 subjects filled each cell of the 
2 X 2 X 4 factorial design matrix. 
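
The classification and assignment steps just described can be sketched as follows: a median split on each dimension yields four subject-type cells, and sets of four subjects drawn at random from a cell are spread over the four treatments. The scores, seed, and function names are hypothetical. With simulated data the cells need not contain exactly 48 subjects each, whereas in the experiment cells were filled until each held 12 subjects per treatment.

```python
import random
from statistics import median

TREATMENTS = ["SCI", "expert", "random", "control"]

def classify(subjects):
    """Label each subject high/low on aptitude and inquiry by median split."""
    apt_med = median(s["aptitude"] for s in subjects)
    inq_med = median(s["inquiry"] for s in subjects)
    cells = {}
    for s in subjects:
        key = ("high" if s["aptitude"] > apt_med else "low",
               "high" if s["inquiry"] > inq_med else "low")
        cells.setdefault(key, []).append(s["id"])
    return cells

def assign_sets(cell_ids, rng):
    """Draw random sets of four from one cell; give each set member a
    different treatment, in random order."""
    pool = cell_ids[:]
    rng.shuffle(pool)
    assignment = {}
    # Subjects left over after the last full set of four stay unassigned.
    for start in range(0, len(pool) - len(pool) % 4, 4):
        order = TREATMENTS[:]
        rng.shuffle(order)
        for subject_id, treatment in zip(pool[start:start + 4], order):
            assignment[subject_id] = treatment
    return assignment

rng = random.Random(0)
# Hypothetical standardized scores; the real data are not reported.
subjects = [{"id": i, "aptitude": rng.gauss(0, 1), "inquiry": rng.gauss(0, 1)}
            for i in range(192)]
design = {}
for cell, ids in classify(subjects).items():
    for subject_id, treatment in assign_sets(ids, rng).items():
        design[subject_id] = (cell, treatment)
```

Assigning within sets of four keeps the treatments balanced inside every subject-type cell, which is what allows the later within-cell comparisons.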


Experimental Materials 

The content, “Computers and How They 
Work,” was not only of interest to college 
students and sufficiently complex, but it was also 
capable of being learned in a nonhierarchic 
manner. That is, it allowed the learner to begin
at any point and follow any sequence with impunity.
Although Gagné (1965) and Campbell
(1964) have suggested that material which is 
hierarchical and of a problem-solving nature offers 
the best opportunity for student self-direction to 
function effectively, it was hoped that under the 
SCI instructional treatment, the learner would 
make optimal use of his unique cognitive struc- 
ture to make the material “meaningful.” 

Performance was measured by three equivalent 
forms of an achievement test. Each form contained 
50 multiple-choice items. Reliability averaged .80
(KR-20). The presentation sequence of test forms 
(pretest, posttest, and retention test) was sys- 
tematically alternated among all subjects; how- 
ever, each set of four yoked subjects received the 
same identical sequence. The two learning sessions 
of an hour each, on consecutive days, were fol- 
lowed 2 weeks later by the retention test. 
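
For readers unfamiliar with the KR-20 coefficient cited above, a minimal sketch of the Kuder-Richardson formula 20 follows. It assumes items scored 0 or 1 and uses the population variance of total scores, which is one common convention; the response matrix is hypothetical and unrelated to the reliability reported here.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a subjects x items matrix of 0/1 item scores."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)
    # Sum over items of p * q, where p is the proportion passing the item.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)

# Hypothetical responses: 5 subjects x 4 items (1 = correct).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))
```

The coefficient rises as the items covary more strongly, that is, as the variance of total scores grows relative to the summed item variances.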

Satisfaction with the learning environment was 
measured by the six following items: 


1. My college courses should be taught this way.
2. I found I had difficulty paying attention.
3. I expect to retain the material covered.
4. I would recommend this experience to a friend.
5. I felt that in this environment, I was in control of my learning.
6. The features of this learning environment increased my motivation to learn in it.

RESULTS 


Table 1 presents cell means and standard 
deviations for each of three dependent 
measures. There are 12 subjects in each cell. 
The results of three 2 x 2 x 4 analyses of 
variance are shown in Table 2. Since the 
SCI treatment was of primary interest, its 
mean has been contrasted with the other 
three treatment means in all three analyses. 


Gain 1 (Posttest minus Pretest) and Gain 2 
(Retention minus Pretest) 


For both Gain 1 and Gain 2 measures 
there were significant differences between 
means for each of the three factors of the 
experimental design. 

Aptitude. Although it might be expected 
that higher college aptitude subjects would 
gain more knowledge, crude gain scores are 
usually uncorrelated with measures of intel- 
ligence. The fact that the material was dif- 
ficult to learn may have served to accen- 
tuate aptitude differences. 

Inquiry. Although the more inquisitive 
subjects gained significantly more knowl- 
edge of computers, the moderate correlation 
(r = .87) between inquiry and college aptitude
dimensions probably accounts for some
of the observed differences. 

Instructional treatments. Overall, sub- 
jects learning under expert and random 
treatments learned significantly more than 
SCI treated subjects. These results indicate 
that in this short learning experience, the 
more familiar the instructional treatment, 
the greater the learning. 

Interactions. There were no overall sig- 
nificant interactions. 

Analyses within subject-type groups. In 
line with the hypothesis of individual dif- 
ferences being related to learning, subject- 
type groups were broken out of the overall 
design and analyzed separately. Again, the 
SCI treatment was contrasted with the 
other three treatment means. Table 3 pres- 
ents analyses of variance for each subject- 
type group.

Among high-aptitude-high-inquiry sub- 
ject-types, SCI treated subjects gained sig- 
nificantly more knowledge than any of the 
other three treatment groups on both gain 
measures. However, the observed differences 
were not as great as expected. 

Among high-aptitude-low-inquiry sub- 
ject-types, expert-treated subjects, as pre- 
dicted, were far superior to the other 
groups. In fact, on both gain measures, 
these subjects demonstrated the greatest 
achievement of all subject-type groups in 
the experiment. 

Among low-aptitude-high-inquiry sub- 
ject-types, SCI-treated subjects learned the 
least, contrary to expectations. Apparently, 
since the material to be learned was new 
and difficult, being inquisitive hindered 
rather than helped these low-aptitude learners.

Among low-aptitude-low-inquiry sub- 
ject-types, all subjects demonstrated so lit- 
tle achievement, that the results are difficult 
to interpret. Random-treated subjects did 
the best as measured by the Gain 1, but did 
not retain well what they had learned. 

After deleting control subjects, two 3 x 4 
factorial analyses of variance were per- 
formed to test the predicted interaction be- 
tween subject-types and instructional treat- 
ments. Overall, both Gain 1 and Gain 2 
TABLE 1
MEANS AND STANDARD DEVIATIONS OF GAIN 1, GAIN 2, AND ATTITUDE SCORES

                                      Instructional treatment
Subject-type                     SCI     Expert   Random   Control    Total

Gain 1 scores
  High aptitude-high inquiry
    M                           23.13    21.14    22.79     2.35      17.35
    SD                           7.79     4.98     8.64     6.21       7.11
  High aptitude-low inquiry
    M                           17.58    24.57    16.19     1.34      14.92
    SD                           9.31     7.65     6.89     4.98       7.94
  Low aptitude-high inquiry
    M                           13.08    15.40    13.80     -.81      10.37
    SD                           8.08     8.98     9.20     5.62       8.17
  Low aptitude-low inquiry
    M                           10.45     9.05    13.36     -.99       7.97
    SD                           8.54     9.20     7.60     4.76       7.64
  Total
    M                           16.06    17.54    16.54      .47      12.65
    SD                           8.60     7.88     8.13     5.21      11.02

Gain 2 scores
  High aptitude-high inquiry
    M                           14.78    14.13    14.08     1.11      11.08
    SD                           8.50     4.47     6.37     4.72       6.08
  High aptitude-low inquiry
    M                            9.44    15.38     7.57      .00       8.10
    SD                           6.58     7.25     5.28     3.96       6.13
  Low aptitude-high inquiry
    M                            2.92     9.18     8.76  [illegible]   5.24
    SD                           5.46     9.99     5.45     5.41       6.87
  Low aptitude-low inquiry
    M                            4.84     4.19     2.90    -2.78       2.29
    SD                           8.75     6.37     7.23     4.38       6.80
  Total
    M                            8.00    10.72     8.33     -.89       6.66
    SD                           7.45     7.30     6.13     4.64       8.44

Attitude toward method of instruction
  High aptitude-high inquiry
    M                           22.33    19.25    17.75    17.33      19.17
    SD                           3.45     3.96     3.62     3.01       3.4
  High aptitude-low inquiry
    M                           21.83    18.17    17.17    17.75      18.73
    SD                           5.41     4.17     5.11     2.63       4.47
  Low aptitude-high inquiry
    M                           19.83    20.00    19.58    16.83      19.06
    SD                           5.08     4.41     4.48     2.15       4.22
  Low aptitude-low inquiry
    M                           22.33    20.25    17.00    18.33      19.48
    SD                           3.65     2.42     4.29     2.34       3.34
  Total
    M                           21.58    19.42    17.88    17.56      19.11
    SD                           4.48     3.82     4.41     2.54       4.15

TABLE 2
SUMMARIES OF ANALYSES OF VARIANCE FOR GAIN 1, GAIN 2, AND ATTITUDE MEASURES

                             Gain 1               Gain 2              Attitude
Source              df     MS         F         MS         F         MS        F
Aptitude (A)         1   2,330.4    40.6***   1,612.4    38.4***      5.0      .3
Inquiry (B)          1     280.6     4.9*       415.4     9.9**        .0      .0
Treatment (C)
  SCI vs. expert     1   1,529.1    20.7***   1,053.5    25.1***      6.0   [illegible]
  SCI vs. random     1   2,188.5    38.2***     654.9    15.6***     92.3     6.1*
  SCI vs. control    1   5,829.7   101.6***   1,687.6    40.2***    388.0    25.6***
A × B                1        .0      .0           .0      .0         8.8      .6
A × C                3     139.8     2.4        108.1     2.6        13.1      .9
B × C                3      33.0  [illegible]    56.4     1.3        18.4     1.2
A × B × C            3     142.7     2.5         95.5     2.3        13.0      .9
Error              176      57.4                 42.0                15.1

*p < .05.
**p < .01.
***p < .001.


yielded results that were not significant (F
= 1.99, df = 6/132, p < .07; F = 1.71, df
= 6/132, p < .12). However, Figure 1 illustrates
that for Gain 1 scores the predicted
interaction did occur, at least for high-aptitude
subjects.


Attitude toward Method of Instruction


As shown in Table 2, neither the degree
of subject aptitude nor the degree of subject
inquisitiveness had an effect on attitude toward
method of instruction. The latter result
was not expected.

TABLE 3
SUMMARIES OF ANALYSES OF VARIANCE OF GAIN 1 AND GAIN 2 FOR EACH SUBJECT-TYPE GROUP

                                             Subject-type group
                        High aptitude-     High aptitude-     Low aptitude-      Low aptitude-
                        high inquiry       low inquiry        high inquiry       low inquiry
Source            df      MS       F         MS       F         MS       F         MS       F

Gain 1
  SCI vs. expert   1     229.8    4.6*     1,480.3  27.7***    405.0    6.1*       18.8     .3
  SCI vs. random   1  [illegible] 16.0       362.7   6.7*      469.7    7.2*      505.7   10.2**
  SCI vs. control  1   9,880.6 [illegible] 1,581.1  29.4***  1,157.9   17.4***    785.5   18.5
  Error           44  [illegible]             54.0               66.6           [illegible]

Gain 2
  SCI vs. expert   1     153.8    4.1*       849.2  22.6***    248.3    5.3*   [illegible] [illegible]
  SCI vs. random   1  [illegible] [illegible]  64.8   1.7    [illegible]  8.0       28.0     .6
  SCI vs. control  1   1,122.0   30.3***     534.9  14.2        47.0    1.0        48.8  [illegible]
  Error           44      36.8                37.5               47.0           [illegible]

*p < .05.
**p < .01.
***p < .001.


JOHN P. FRY 


[Figure 1: line graph of mean Gain 1 scores (vertical axis) by instructional treatment (horizontal axis: SCI, Expert, Random). Curves labeled HA/HI, HA/LI, and LA/HI are legible. Key: H = high, L = low, A = aptitude, I = inquiry.]

Fig. 1. Mean Gain 1 scores as a function of subject-type group and instructional treatment.


Overall, SCI-treated subjects developed a 
more positive attitude toward the method 
of instruction than both the random and 
control subjects. Although SCI subjects’ at- 
titudes were also more favorable than those 
of the expert subjects, the difference was 
not statistically significant. 

Among subject-type groups, the only
noteworthy result was the sharp drop in attitude
of the low-aptitude-high-inquiry SCI
subjects. Their attitude toward the method
of instruction reflected their performance.


Discussion

There are at least two reasons why SCI-treated
subjects did not learn more than
they did. First, they spent over 2 hours
learning, yet asked on the average only [number illegible]
questions apiece. This suggests that the
learning environment, as designed, was still
too structured. Second, of the questions
asked, few could be called review or self-testing
type. This suggests that lack of experience
with self-directed learning environments
prevented these learners from
making full use of self-direction opportuni- 
ties. The fact that Campbell (1964) had to 
train subjects in self-direction to obtain sig- 
nificant differences in learning corroborates 
this view. 

Although subjects exposed to the random 
instructional treatment were the least 
pleased with the method of instruction, 
their overall learning performance was 
comparable to other subjects. Such findings 
confirm similar results with programmed 
material (Briggs, 1968; Brown, 1970). An- 
ecdotal evidence suggests that these sub- 
jects, especially the more inquisitive ones, 
regarded learning under a random sequence 
as a challenge. 

The results as a whole support the hy- 
pothesis that “learning styles” or individual 
differences are related to learner/instructor 
control of instruction (at least for high-ap- 
titude students). Thus, if a student were 
high in both aptitude and inquisitiveness, 
he should be assigned to an SCI-type in- 
structional treatment. Otherwise, he should 
be assigned to an expert-type instruction. 

Such prescriptions are premature, to say 
the least. Only one type of subject matter is 
represented here and only a few subject 
types and treatments have been tested. 
Thus, far more research is required before 
learning styles are likely to have any practical
impact on efforts to completely individualize
instruction. In fact, Tallmadge
and Shearer (1971) have recently expressed 
discouragement with the inconsistent find- 
ings of their exploratory studies. Nonethe- 
less, they have predicted that significant 
developments in the learning style area 
would: (a) involve noncognitive learner 
characteristics; (b) result from testing spe- 
cifically formulated hypotheses; and (c) be 
related to curiosity theory (Beswick & 
Tallmadge, 1971). In general, the results 
here bear out these predictions. 

The purpose of this investigation was to evaluate the effectiveness of
a preinstruction retention index. This index was designed for utiliza- 
tion in a strategy to maximize recall of mathematical concepts by pre- 
dicting the idiosyncratic number of examples per mathematical concept 
required by each student. The index score was determined for 53 eighth- 
grade students who were then assigned to one of three treatments: (a) 
retention index—students were presented concepts followed by an 
idiosyncratic number of examples as determined by their preinstruc- 
tion retention index score; (b) choice—each student was allowed to de- 
termine the number of examples he would receive per concept; (c) 
fixed—all students were given three examples per concept. The results 
indicated that females in the retention index treatment retained signifi- 
cantly more than females in either the choice or fixed group, while males 
in the choice treatment retained more than males in the other groups. 


Much research effort has been devoted to 
the exploration of variables related to retention
of learned material. This has been
due, at least partially, to disturbing reports 
that up to 66% of the concepts learned in 
high school and college courses are forgot- 
ten within 2 years (Pressey, Robinson, & 
Horrocks, 1959). The problem is even more 
serious in curriculum areas such as mathematics
that are hierarchical in nature; successful
acquisition of new concepts is very
much dependent on retention of previously
learned concepts. Findings such as Layton’s
(1932) that after a 1-year interval only
33⅓% of initial algebra material was retained
suggest that considerable time is
wasted relearning relevant prerequisite ma- 
terial. 

* The author wishes to express her appreciation
to Walter Dick for his valuable methodological
and editorial comments and Harold F. O'Neil, Jr.
for his suggestions. This research was supported
in part by a contract to Duncan N. Hansen from
the Office of Naval Research (N0014-68-A-0494).
A paper based on this research was presented to
the 1970 American Education Research Association
meeting, Minneapolis, Minnesota.

In the search for ways to improve retention,
one variable that has fairly consistently
been found to have a positive effect
is practice. Evidence is fairly conclusive for 
both nonmeaningfully related material 
(e.g., Tulving, 1968; Underwood, 1964) and 
meaningfully related material (e.g., Ausubel
& Youssef, 1965; Reynolds & Glaser,
1964). Unfortunately, these findings have
been too often uncritically applied to classroom
instruction. For example, in mathematics
the number of practice examples assigned
to students learning a new concept or
rule has typically been uniform for all students
and determined by the teacher or
textbook. Inadequate attention has been 
given to individual differences and the pos- 
sibility that different students may require 
different numbers of examples. Any given 
amount of practice is probably uneconomi- 
cal overlearning for some students (Shay, 
1961) and inefficient underlearning for oth- 
ers. 
The growth and development of compu- 
ter-assisted and computer-managed instruc- 
tion has resulted in a viable strategy for 
individualization of instruction. Using computer-assisted
and computer-managed instruction,
student data can be collected,
stored, and analyzed, and the instruction



RETENTION INDEX FOR MATHEMATICS 


adjusted to individual differences. The 
problem is how to determine the amount of 
practice (or in the case of mathematics the 
number of examples) needed by each stu- 
dent in order to individualize appropriately 
the instruction. 

The present study was conducted to com- 
pare three methods of determining amount 
of practice received on a unit of mathematics
presented via computer-assisted instruction
in order to determine which method
resulted in the most efficient retention of 
material as measured by a delayed criterion 
measure. The three methods under study 
were: (a) the retention index method: determining
the idiosyncratic number of examples
needed by a student per mathematical
concept on the basis of his score on a
preinstruction retention index; (b) the
choice method: allowing students complete
choice on each concept as to the number of
examples to be received; and (c) the fixed
method: presenting each student with a
constant number of examples (three) per
mathematical concept.

It was hypothesized that due to the empirically
based individualizing of the instruction,
the retention index group would
perform better on measures of both immediate
and delayed retention than either the
choice group or the fixed group. Also, in the
belief that (a) the choice group would tend
to underestimate the number of examples
needed and (b) three examples would
not be sufficient for the majority of students
in the fixed group, it was hypothesized that
the retention index group would make fewer
acquisition errors (errors on response
frames within the computer-assisted instruction
program).


METHOD 
Subjects


This study was conducted using 53 eighth-grade
students (27 females and 26 males) from the
Florida State University School.2 All but one of
the subjects had a measured intelligence rated as
average or above; the range of beta IQs (conversion
of the OTIS raw scores) was from 91 to 129
with a mean of 115.

2 The author expresses sincere thanks to E. H.
[name illegible] of the Florida State University School for
his cooperation in scheduling students.


Apparatus


The learning materials were presented by an 
IBM 1500 Computer-Assisted Instruction system. 
Terminals for this system consist of a cathode-ray 
tube, a light pen, and a keyboard. The terminals 
were located in an air-conditioned sound-deadened 
room. The computer-assisted instruction system 
administered the learning task and recorded the 
students’ responses. 


Learning Materials 


Preparation of learning materials involved three 
stages: development of the preinstruction retention 
index criterion test; development of the pre- 
instruction retention index; and development of 
an instructional program on polynomials and the 
accompanying immediate- and delayed-retention 
criterion tests. 

The preinstruction retention index criterion test 
was developed in the following manner. Thirty 
high school mathematics texts (elementary and 
advanced) were secured. From these sources 40
concepts were selected, these being a representa- 
tive sample of all areas covered in the texts. Con- 
cepts were selected on the basis of two criteria: 
(a) nonfamiliarity—concepts were selected for 
which most eighth-grade subjects would have no 
previous knowledge; and (b) relative independ- 
ence—concepts chosen, while coming from areas 
unknown to the subjects, could be learned inde- 
pendently of any specific entry skills; they were 
nonhierarchical in that they could be learned in and
of themselves. 

The concepts selected were randomly ordered 
and a criterion test was developed. The test con- 
sisted of one item per concept; items were of the 
recall type, requiring the filling in of the appro- 
priate term or a computation using the appro- 
priate algorithm. In order to validate its non- 
familiarity, the 40-item criterion test was field 
tested on a high ability (as defined by the school) 
group of public school eighth graders. Since the 
entire population of eighth graders at the Univer- 
sity School was required for the actual study, and 
since they were of above-average intelligence, it 
was felt that a high ability group of eighth graders 
from another junior high school would make a 
reasonable validation group. To insure the non- 
familiarity of the content, any test item to which 
more than 5% of the students responded correctly 
was eliminated from the measure. The result was 
a 30-item criterion test. 
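
The nonfamiliarity screen described above amounts to a simple filter on field-test results. The following is a minimal sketch under stated assumptions: the function name, the argument layout, and the sample counts are hypothetical, and only the 5% cutoff comes from the text.

```python
from typing import List, Sequence


def screen_items(correct_counts: Sequence[int],
                 n_students: int,
                 max_proportion: float = 0.05) -> List[int]:
    """Return the indices of items retained after the nonfamiliarity screen.

    An item is eliminated when more than max_proportion of the field-test
    students answered it correctly; all other items are kept.
    """
    return [
        i for i, count in enumerate(correct_counts)
        if count / n_students <= max_proportion
    ]


# Hypothetical field-test counts for four items and 100 students.
kept = screen_items([0, 3, 6, 50], n_students=100)
```

Applied to the 40-item draft, a screen of this kind would leave the 30 items that formed the final criterion test.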

Following the development of the preinstruction 
retention index criterion test, the index itself was 
developed, coded, and entered on the IBM 1500 
system. It was developed by randomly dividing 
the 30 concepts into five groups. The concepts in 
each of the groups were presented following the 
same general format. For each concept three 
frames were written: a frame presenting the con- 
cept, a frame requiring the reading of an example, 
and a frame requiring the student to demonstrate 
recall of the concept (response frame). Each of the 
concepts in the first group was presented with 1
example, in the second group with 2 consecutive 
examples, in the third with 5 consecutive examples, 
in the fourth with 10 consecutive examples, and in 
the fifth with 15 consecutive examples. The specific 
numbers of examples selected (1, 2, 5, 10, and 15) 
were chosen in order to provide discrimination 
for smaller numbers of examples. It was felt that 
including 15 examples would eliminate the pos- 
sibility of some students reaching a ceiling. The 
completed preinstruction retention index consisted 
of 30 randomly ordered concepts, each accom- 
panied by 1, 2, 5, 10, or 15 examples and a response 
frame. Maximum administration time was found 
to be 1 hour. 

In order to test the viability of the index, an 
instructional unit on polynomials was developed. 
The main reason for this choice was that the 
author had previously written a programmed text
on polynomials that had been successfully field 
tested. Second, it was determined that polynomials 
was a topic not yet encountered by the University 
School students, but one for which they possessed 
the necessary entry skills. The instructional program
consisted of a total of 16 concepts on polynomials.
These concepts were presented in the
format described for the preinstruction retention
index, that is, presentation of a concept, examples,
and a response frame. Fifteen examples were
written for each concept; the logic of the program,
however, permitted each student to have a dif- 
ferent number of examples as determined by the 
treatment to which he was assigned. Immediate 
and delayed criterion tests were developed for the 
program using items comparable, but not identi- 
cal, to response frames in the instructional pro- 
gram. The two criterion tests were identical except 
for the actual numbers used. 


Procedure 


All students worked through the index in groups 
of 12 and 13; as the index was designed to im- 
prove delayed retention, its criterion test was 
administered 1 week later. Twenty-one of the 
students were randomly selected to be in the 
retention index group. For those selected students, 
retention curves were plotted from the preinstruction
retention index criterion test scores. Figure 1
demonstrates a typical curve. The vertical axis is
the percentage correct on the preinstruction retention
index retention test and the horizontal axis
shows the number of examples for which this
percentage was achieved. The student in Figure 1
correctly recalled: none of the concepts presented
with 1 example, 33% of the concepts presented
with 2 examples, 50% of the concepts with 5 examples,
100% with 10 examples, and 50% with 15
examples. The pattern shown by the student in
Figure 1 held for all the data graphed. For all 
students a retention peak was reached at either 5 
or 10 examples followed by a decline. This held 
true regardless of the maximum percentage of 
retention (which ranged from 33% to 100%). On 
the basis of these curves, the optimal number of 
examples for each student in the retention index
group was determined. For example, it was deter- 
mined that the student in Figure 1 required 10 
examples. Students not in the retention index 
group were randomly assigned to either the choice 
group or the fixed group. 
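
Reading the optimal number of examples off a retention curve can be sketched as picking the level at which delayed retention peaked. This is an illustrative sketch only: the function name is hypothetical, and the tie-breaking rule (prefer the smaller number of examples) is an assumption, since the text does not say how ties were resolved.

```python
from typing import Mapping


def optimal_examples(percent_correct: Mapping[int, float]) -> int:
    """Return the number of examples at which delayed retention peaked.

    percent_correct maps a number of examples (1, 2, 5, 10, or 15) to the
    percentage of concepts recalled at that level. Ties are resolved in
    favor of the smaller number of examples (an assumption).
    """
    best = max(percent_correct.values())
    return min(n for n, pct in percent_correct.items() if pct == best)


# The student described for Figure 1: 0%, 33%, 50%, 100%, and 50%.
figure_1_student = {1: 0.0, 2: 33.0, 5: 50.0, 10: 100.0, 15: 50.0}
prescribed = optimal_examples(figure_1_student)
```

For the Figure 1 student the sketch returns 10, which agrees with the assignment reported in the text.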

Two weeks later, the students received the in- 
struction on polynomials. All 53 students received 
exactly the same instruction, differing only in 
number of examples received. Students in the 
retention index group were presented each con- 
cept followed by the number of examples that 
had been determined by the preinstruction reten- 
tion index, and then were branched to a response 
frame. Students in the choice group were presented 
with a concept and an example, and were then 
presented with an option to see another example 
(up to 15) or to be branched immediately to the 
response frame. Students in the fixed group were 
presented a concept, three examples, and were 
then branched to the response frame. The fixed 
group was instructed in the manner typically 
adopted by traditional instruction, with each sub- 
ject receiving the same number of examples. All 
students were presented one response frame per 
concept and were given just one opportunity to 
respond, regardless of whether they correctly or 
incorrectly recalled that concept. Consequently, 
the range of possible acquisition errors was from 
0 to 16. 
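
The branching rules for the three treatments can be summarized in a short sketch. The function and parameter names are hypothetical; the constants (three fixed examples, a ceiling of 15) come from the text, and the actual IBM 1500 program logic is not reproduced here.

```python
from typing import Callable, Optional

MAX_EXAMPLES = 15    # examples written per concept
FIXED_EXAMPLES = 3   # constant number shown in the fixed treatment


def examples_shown(treatment: str,
                   index_examples: Optional[int] = None,
                   wants_another: Optional[Callable[[int], bool]] = None) -> int:
    """Number of examples a student sees for one concept before the response frame.

    treatment is "retention_index", "choice", or "fixed".
    index_examples is the number prescribed by the preinstruction index.
    wants_another(seen) models the student's choice after `seen` examples.
    """
    if treatment == "retention_index":
        if index_examples is None:
            raise ValueError("retention_index treatment needs index_examples")
        return index_examples
    if treatment == "fixed":
        return FIXED_EXAMPLES
    if treatment == "choice":
        if wants_another is None:
            raise ValueError("choice treatment needs wants_another")
        seen = 1  # the first example is always presented
        while seen < MAX_EXAMPLES and wants_another(seen):
            seen += 1
        return seen
    raise ValueError(f"unknown treatment: {treatment}")
```

Because every student then answers one response frame for each of the 16 concepts, acquisition errors fall between 0 and 16 regardless of treatment.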

It took each student 3 days to complete the in- 
structional program with a maximum of 1 hour per 
day. The program was divided into three parts so
that students could work through it during their 
regularly scheduled mathematics period. Due to 
the nature of the treatment groups, some students 
finished each session sooner than others. At the 
conclusion of each session the students were 
administered an immediate criterion test on the 
material covered during that session. The scores 
on all three immediate retention measures were 
combined to give an immediate retention total. 
One week following the last session all students 
were administered the delayed retention criterion 
test. 


RESULTS 


A correlation of .35 (p < .01) between 
sex and delayed retention indicated that fe- 
males retained more of the instruction 
material than males. As males are typically 
found to perform better on mathematical 
tasks, this correlation suggested that the 
treatments might be differentially effective 
for males and females; therefore, all analyses
included sex as a factor. Even though
original assignment to treatment was not
done on the basis of sex, there were approximately
the same number of males (26) as
females (27) in the experiment, and they
were randomly assigned to one of the
treatments. Further, although the resulting
cell sizes were not equal, they were judged
to be sufficiently tolerable to permit analysis
and interpretation (see Table 1); consequently,
data were analyzed using a general
linear hypothesis model in which calculations
were adjusted to accommodate unequal
sample sizes.

[Figure 1: delayed retention curve; horizontal axis: Number of Examples.]

Fig. 1. Typical delayed retention curve for the preinstruction retention index
criterion measure.

The results of a 3 × 2 (treatment by sex)
factorial analysis of variance on immediate
retention scores confirmed the suspected interaction
(F = 7.44, df = 2/41, p < .01).
Application of the Newman-Keuls procedure
to the six cell means showed that for
females, the retention index method resulted
in significantly better immediate retention
scores than both the choice method and the
fixed method (p < .05). Also, males in the 
choice group performed significantly better 
than females in the choice group. The 
means (see Table 1) show that males in the 
choice group tended to perform better than 
males in both the retention index group and 
the fixed group. 

The results of a 3 × 2 (treatment by sex)
factorial analysis of variance on delayed
retention scores also revealed a significant
interaction between treatment and sex (F
= 6.15, df = 2/47, p < .01) as well as a
significant sex effect (F = 5.15, df = 1/47,
p < .05). Application of the Newman-Keuls
procedure to the six cell means showed that
females in the retention index group retained
significantly more than females in
both the choice group and the fixed group;
they also retained significantly more than
males in the retention index group and the
fixed group (p < .05), but not males in the
choice group (see Figure 2). Again, the
means show that, to an even greater degree
than for immediate retention, males in the
choice group tended to retain more than
males in both the retention index group and
the fixed group (see Table 1).

[Figure 2: delayed retention by treatment condition (horizontal axis: Treatment Conditions; the labels Choice and Fixed are legible), with separate curves by sex.]

Fig. 2. Sex × Treatment interaction for delayed retention data (N = 53).

The results of a 3 × 2 (treatment by sex)
factorial analysis of variance on acquisition
errors revealed a significant treatment effect
(F = 5.78, df = 2/47, p < .01) as well
as a significant Treatment × Sex interaction
(F = 3.33, df = 2/47, p < .05). Application
of the Newman-Keuls procedure to
the six cell means showed that females in
the fixed group made significantly more errors
than females in the retention index
group. Also, as hypothesized, the means
(see Table 2) show that the males and females
in the retention index group made
fewer errors than subjects in any other
treatment by sex combination.

It could be argued that delayed retention
performance differences were attributable
to a linear relationship with numbers of examples
rather than treatment effects; however,
a correlation of .18 did not support
this hypothesis. Furthermore, a 2 × 2
(number of examples [5 or 10] by sex) factorial
analysis of variance on the delayed
retention scores of the retention index group
did not show a significant effect for number
of examples. As would be expected, the
TABLE 1
MEANS AND STANDARD DEVIATIONS FOR
IMMEDIATE AND DELAYED RETENTION
BY GROUP AND BY SEX

                     Immediate retention     Delayed retention
Group                 Male      Female        Male      Female

Retention index
  M                   9.00      12.56         4.42       9.44
  SD                  3.95       2.46         1.88       3.13
  N                  12          9           12          9
Choice
  M                  10.83       5.33         7.00       5.77
  SD                  2.79       5.27         1.67       3.19
  N                   7          9            7          9
Fixed
  M                   9.13       7.1          4.62       5.89
  SD                  2.90       2.20         2.83       2.90
  N                   7          9            7          9

Note. N = 53.


main effect for sex was significant (F =
14.08, df = 1/17, p < .01). Application of
the Newman-Keuls procedure to the four
cell means revealed that females with 10
examples retained significantly more than
males with both 5 and 10 examples, but not
significantly more than females with 5 examples.


Discussion 


The results strongly suggest that there
are differential treatment effects depending
on the sex of the subject; the retention
index method was significantly better for
females, while the choice method was better
for males (for both immediate and delayed
retention). Previous research on sex differences
has indicated that superiority of either
sex is a function of the nature of the
material (Dietze, 1932; Layton, 1932;
Revay, 1938). The present study suggests
that, similarly, superiority of instructional
method is dependent on sex.

This study also suggests that traditional
methods of mathematics instruction,
whereby all students receive the same
amount of practice, are not conducive to
promoting retention. Students instructed by
the fixed method of allocating practice examples
were inferior to students in the retention
index and choice groups on measures
of acquisition, immediate retention, and de- 
layed retention. The suggestion is that the 
high levels of forgetting that are often re- 
ported, may be a result of the utilization of 
inadequate strategies for providing practice. 
Ausubel (1968) similarly concluded that if 
adequate attention were paid to such consid- 
eration as optimal review, students might 
retain over a lifetime most of the important 
ideas they learn in school. 

It is interesting to note that in the group 
in which students were given the freedom to 
choose the number of examples they would 
see for each concept, the average number of 
examples chosen was three; therefore, the 
males in the choice group and the males in 
the fixed group received the same average 
number of examples, and yet males in the 
choice group retained more than males in 
the fixed group. It is reasonable to assume, 
however, that while the average number of 
examples chosen by the choice group was 
three, more were chosen for more difficult 
concepts and fewer were chosen for less dif- 
ficult items. Analogously, for the fixed 
group, three examples might have been too 
few for some concepts and too many for 
others. The previously noted fact that the
preinstruction retention index curves exhibited
sharp drops after a certain number of
examples suggests that too many examples
may interfere with retention. It would be
fruitful in future research to assess the difficulty
of individual concepts and to then
compare performance between the choice
and fixed groups.


TABLE 2
MEANS AND STANDARD DEVIATIONS FOR NUMBER
OF ACQUISITION ERRORS BY GROUP
AND BY SEX

                     Acquisition errors
Group                 Male      Female

Retention index
  M                   2.67       2.33
  SD                  2.99       2.29
  N                  12          9
Choice
  M                   6.00       3.89
  SD                  2.76       1.88
  N                   7          9
Fixed
  M                   3.75       6.33
  SD                  2.31       3.00
  N                   7          9

Note. N = 53.

The preinstruction retention index would 
seem to be a reasonably effective instrument 
for determining the number of examples 
needed by females. The fact that there was 
no difference in delayed retention perform- 
ance between subjects receiving 5 examples 
and subjects receiving 10 examples suggests 
that the index measured some variable 
heretofore unassessed. One limitation of the 
index was its lack of discrimination be- 
tween 5 and 10 examples. It was originally 
believed that the index should be designed 
to allow for finer distinction at the lower 
end of the index. It is now suggested that 
future research with these materials use a 
revision of the preinstruction retention 
index wherein the concepts are presented 
with 4, 6, 8, 10, or 12 examples rather than 
the original 1, 2, 5, 10, and 15. The index 
should further be revised to take item diffi- 
culty into account; this would result in a 
subject receiving a certain number of exam- 
ples for less difficult concepts and a greater 
number of examples for more difficult con- 
cepts. Reasonably, this would not increase 
learning time but would increase delayed 
retention. Finally, both the preinstruction 
retention index and the instructional pro- 
gram should be revised so that subjects are 
required to respond after each example dur- 
ing the learning session, rather than simply 
reading and studying each example; this 
would logically increase delayed retention 
scores regardless of treatment. It is a value 
judgment as to whether the increased de- 
layed retention would be worth the addi- 
tional time involved. It may be argued that 
the additional time is worth it, especially for 
curriculum areas such as mathematics in
which the content is hierarchical and successful
acquisition of a new concept is very
much dependent on retention of one or more 
previously learned concepts. 

In conclusion, the results do suggest an 
interesting phenomenon that as yet has not 
been effectively investigated. Up to this 
time there has been no attempt to differen- 
tiate instruction on the basis of sex; pre- 
vious efforts have been in the direction of 
differentiation on the basis of various apti- 
tude and personality variables. It is very 
possible that an important aspect of indi- 
vidualization has been too long overlooked. 
EFFECTS OF STRESS ON STATE ANXIETY AND PERFORMANCE
IN COMPUTER-ASSISTED LEARNING¹


HAROLD F. O'NEIL, JR.²


Florida State University 


The effects of stress on state anxiety (A-State) and on performance 
in a computer-assisted mathematical task were investigated for female 
college subjects who differed in anxiety proneness (A-Trait). In the 
stress condition, subjects received negative feedback about perform- 
ance; the subjects in the nonstress condition were given a brief rest 
period. The high A-Trait subjects in the stress condition showed a
significantly greater initial increase in A-State than did the low A-Trait
subjects. During the learning task, a different pattern of changes was
noted. Level of A-State for high A-Trait subjects in the stress condition
decreased, while for the low A-Trait it remained relatively constant.
In the nonstress condition, the changes in A-State for high and
low A-Trait subjects were parallel. High A-State subjects made
significantly more errors than low A-State subjects on the easier sections
of the computer-assisted instruction task, but not on the most difficult.


The general purpose of this study was to
investigate the effects of stress on state anxiety
and computer-assisted learning for college
students who differed in anxiety proneness
(trait anxiety). Mathematical concepts
were presented by computer-assisted in- 
struction (CAI) which permitted the pres- 
entation of these complex learning materi- 
als under carefully controlled conditions. In 
general, in CAI systems, a computer is em- 
ployed to control the selection, sequencing, 
and evaluation of instructional materials 
(Fishman, Keller, & Atkinson, 1968). 

¹ This paper is based on a doctoral dissertation
submitted to the Department of Psychology,
Florida State University, Tallahassee, Florida.
Thanks are due to Charles D. Spielberger,
who chaired the author's doctoral committee, and
to Duncan N. Hansen, who provided invaluable
advice and assistance.
Portions of the data were presented at the
meeting of the American Educational Research
Association, Minneapolis, March 1970. This research
was supported by the Office of Naval Research,
Contract No. N00014-68-A-0494.
² Requests for reprints should be sent to Harold
F. O'Neil, Jr., Department of Educational Psychology,
University of Texas, Austin, Texas 78712.

Hypotheses about the effects of anxiety
on learning were derived from Drive
Theory (Spence, 1958; Taylor, 1956) and
Trait-State Anxiety Theory (Spielberger, 
1966, 1971). Drive Theory predicts that 
the performance of high-anxious subjects 
will be inferior to that of low-anxious sub- 
jects on complex or difficult learning tasks 
in which competing error tendencies are 
stronger than correct responses. In contrast, 
on simple learning tasks, in which correct 
responses are dominant relative to incorrect 
response tendencies, it would be expected 
that the performance of high-anxious sub- 
jects would be superior to that of low-anx- 
ious subjects. The findings of many studies 
provide empirical support for Drive Theory 
(e.g., Montague, 1953; Spence, 1964; Spence 
& Spence, 1966). 

Research on anxiety and learning guided 
by the Spence-Taylor Drive Theory has 
suffered from ambiguity with regard to the 
status of anxiety as a theoretical concept. 
According to Spielberger’s Trait-State Anxiety
Theory, it is essential to distinguish
between anxiety as a transitory state or as 
a relatively stable personality trait. State 
anxiety (A-State) refers to a complex con- 
dition or response that varies in intensity 
and fluctuates over time. This condition is
characterized by feelings of tension and
apprehension and by activation of the
autonomic nervous system. Trait anxiety
(A-Trait) refers to individual differences in
anxiety proneness. High A-Trait individu- 
als are more disposed than those low in A- 
Trait to perceive certain types of situations 
as dangerous, particularly situations that 
involve some threat to the individual’s 
self-esteem. 

O’Neil, Hansen, and Spielberger (1969) 
investigated the relationship between A- 
State and performance on a CAI task for 
college males with extreme scores on the 
A-Trait scale of the State-Trait Anxiety 
Inventory (STAI; Spielberger, Gorsuch, & 
Lushene, 1970). Difficult and easy CAI 
learning materials were presented by an 
IBM 1500 system which also presented the 
STAI A-State scale before, during, and 
after the learning task. The findings of an 
earlier study (O’Neil, Spielberger, & Han- 
sen, 1969) were confirmed in that (a) A- 
State scores increased while subjects 
worked on difficult materials and decreased 
when they responded to easy materials; and 
(b) high A-State subjects made significantly
more errors on the difficult materials
than low A-State subjects. While there was 
no relation between A-Trait and perform- 
ance, high A-Trait subjects responded 
throughout the learning task with higher 
levels of A-State than low A-Trait subjects. 

On the assumption that the CAI situation 
involved some threat to self-esteem, 
Trait-State Anxiety Theory would predict 
that the magnitude of increase in A-State 
would be greater for high A-Trait subjects 
than for low A-Trait subjects. This expecta- 
tion was not confirmed in the study in 
which subjects were selected on the basis of 
A-Trait scores. A possible explanation is 
that while the CAI task was stressful be- 
cause it was difficult, it was not necessarily 
more stressful for high A-Trait subjects 
than for low A-Trait subjects because it did 
not evaluate the adequacy of the subject’s 
performance relative to others. If explicit 
negative evaluations concerning perform- 
ance were given by the computer, then high 
A-Trait subjects might be expected to per- 
ceive the CAI situation as more threatening 
than would low A-Trait subjects, and to
respond with higher levels of A-State.

In the present study, subjects in the
stress condition were given negative
feedback regarding their performance on a CAI
learning task, whereas subjects in the
nonstress condition received neutral feedback.
Scores on the STAI A-State scale and errors 
on the CAI learning task were the principal 
dependent variables. Hypotheses derived 
from Trait-State Anxiety Theory (Spiel- 
berger, 1971) were: (a) the stress condition 
would evoke higher levels of A-State than 
the nonstress condition; (b) high A-Trait 
subjects would display higher levels of A- 
State than low A-Trait subjects throughout 
the experimental task; (c) during the per- 
formance period, high A-Trait subjects in 
the stress condition would respond with 
greater increments in A-State intensity 
than low A-Trait subjects; and (d) for the 
nonstress condition, increments in A-State 
were expected to be independent of level of 
A-Trait. 

Assuming that level of Drive is a function 
of A-State, the following hypotheses were 
derived from Drive Theory: (e) for the
more difficult sections of the learning task, 
high A-State subjects would make more er- 
rors than low A-State subjects; and (f) on 
the easier sections of the learning task, high 
A-State subjects would make fewer errors 
than low A-State subjects. 


METHOD 


Selection of Subjects 


The STAI A-State and A-Trait scales were 
group-administered with standard instructions to 
583 students enrolled in the introductory psy- 
chology course at Florida State University. From 
this population, females whose STAI A-Trait 
scores were in the upper and lower 20% of the 
distribution were invited to participate in an
experiment on computer-assisted learning. Cut-off
scores for the high A-Trait and low A-Trait
subjects were 45 and 31, respectively. The subjects
who volunteered for the experiment were assigned
to either the stress or nonstress conditions in a
manner such that mean preexperiment A-Trait
and A-State scores for high A-Trait and low
A-Trait subjects were comparable (O'Neil, 1969).
A total of 73 subjects were run in groups of from
8 to 13. Of these, 7 subjects were eliminated
because of missing data, and 2 additional subjects
were subsequently dropped from the low A-Trait
nonstress group to provide for equal cell
frequencies.
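
The selection rule described above (upper and lower 20% of the A-Trait distribution) can be illustrated with a minimal Python sketch. This is not the procedure actually used in the study, which is documented only in O'Neil (1969). The function name `select_extreme_groups`, the dictionary of scores, and the handling of tied scores at the cutoffs are assumptions made for illustration.

```python
def select_extreme_groups(scores, fraction=0.20):
    """Return (low_ids, high_ids): the lowest and highest `fraction` of scorers.

    `scores` maps a subject id to an A-Trait score. Ties at the cutoff are
    broken by input order, which is an assumption of this sketch only.
    """
    ranked = sorted(scores, key=scores.get)   # subject ids, ascending score
    k = int(len(ranked) * fraction)           # subjects per extreme group
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]


# Hypothetical scores for ten subjects, for illustration only.
example_scores = {"s%d" % i: 20 + i for i in range(10)}
low_group, high_group = select_extreme_groups(example_scores)
```

With ten hypothetical subjects and a fraction of .20, each extreme group holds two subjects.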


Apparatus 


The materials were presented by an IBM 1500 
Computer-Assisted Instruction System which also 
administered the STAI A-State scales and recorded 
the subjects’ responses. Terminals for this system 
consist of a cathode-ray tube, a light pen, and a 
keyboard. The terminals were located in an air- 
conditioned, sound-deadened room. 


Learning Materials 


The CAI task, which consisted of difficult 
mathematics learning materials relating to the 
field properties of complex numbers, was adapted 
from the task used by O’Neil, Hansen, and 
Spielberger (1969). This task was divided into 
three sections, labeled A, B, and C, with five prob- 
lems per section. Programming logic required the 
subjects to solve each successive problem correctly 
before they could progress to the next one. 


Anxiety Measures 


The State-Trait Anxiety Inventory (Spielberger
et al., 1970) was used to measure both A-Trait
and A-State. The 20-item STAI A-State scale was
given before and after the learning task. In
addition, a short form of the A-State scale,
consisting of the five items with the highest
item-remainder correlations in the STAI normative
sample (Spielberger et al., 1970), was given
immediately after each of the three sections of the
learning task.
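
The item-remainder criterion mentioned above can be sketched in Python as follows. This is a hedged illustration, not the analysis reported by Spielberger et al. (1970): the function names and the toy response matrix are hypothetical, and the sketch assumes that all items have already been keyed in the same direction.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_remainder_correlations(responses):
    """Correlate each item with the sum of all the other items.

    `responses` is a list of per-subject lists of item scores.
    """
    totals = [sum(r) for r in responses]
    corrs = []
    for j in range(len(responses[0])):
        item = [r[j] for r in responses]
        remainder = [t - i for t, i in zip(totals, item)]
        corrs.append(pearson(item, remainder))
    return corrs


def top_items(responses, k=5):
    """Indices of the k items with the highest item-remainder correlations."""
    corrs = item_remainder_correlations(responses)
    return sorted(range(len(corrs)), key=lambda j: corrs[j], reverse=True)[:k]
```

Applied to a 20-item response matrix with `k=5`, `top_items` would return the positions of the five items that form a short scale of the kind described here.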


Experimental Procedure 


The experimental procedures were divided into
three periods: (a) a pretask period during which
subjects read instructions and learned how to
operate the terminal; (b) a task period in which
subjects learned the field properties of complex
numbers and received differential feedback
regarding their performance; and (c) a posttask period
in which subjects were interviewed and debriefed.
These procedures are described below and in more 
detail in O'Neil (1969). 

³ The five-item A-State scale administered by
the computer during the CAI learning task
consisted of the items from the 20-item STAI A-State
scale which had shown the highest item-remainder
correlation coefficients in previous research
(Spielberger, Gorsuch, & Lushene, 1970). These items
were: (a) “I am tense”; (b) “I feel at ease”; (c)
“I am relaxed”; (d) “I feel calm”; (e) “I am
jittery.” The subject responded to each item by
rating himself on the following 4-point scale: (a)
“Not at all”; (b) “Somewhat”; (c) “Moderately
so”; (d) “Very much so.”

Pretask period. After being seated at a CAI
terminal, each subject in the nonstress condition
was given orienting instructions, including:

It has been found that success in this program
does not require mathematical or quantitative
ability; it requires, instead, the ability to make
the same kind of abstractions and generalizations
that you are expected to make in many college
courses.


The same instructions were given to the subjects 
in the stress condition, but the following sentence 
was added: 


The computer has been programmed to evaluate 
your performance as you progress through the 
learning materials. 

Each subject was then asked to read the de- 
scription of the operation of the CAI terminal, and 
the experimenters answered questions and demon- 
strated the procedures as needed. After “signing 
on,” all subjects responded to the STAI A-State 
scale which was given with standard instructions 
(“indicate how you feel right now”).

Task period. During the task period, all sub- 
jects worked through the same CAI learning ma- 
terials, each progressing at her own speed. Im- 
mediately after each of the three sections of the 
learning task, the short form of the STAI A-State 
scale was given with instructions to "indicate how 
you felt during the section of the course you have 
just finished." Following the third problem in each 
section of the task, subjects in the stress condition 
received negative feedback while those in the non- 
stress condition were given a rest period. 

Since it was anticipated that errors would de- 
crease across the task (O'Neil, Hansen, & Spiel- 
berger, 1969), it was necessary to develop three 
different negative feedback statements. The first 
of these evaluative statements focused upon ade- 
quacy of subjects’ performance, the second on the 
speed and accuracy of performance, and the third 
primarily on speed. Each negative feedback state- 
ment was presented by the computer for 15 
seconds under the title “Evaluation.” Subjects in 
the nonstress condition were instructed by the 
computer to “Take a brief rest.” This rest state- 
ment was presented for 15 seconds at exactly the 
same points during the learning task that the stress 
groups were given negative feedback. 

Posttask period. After each subject completed 
the learning task, she was given the final A-State 
measure (with standard instructions). 


RESULTS 


Changes in STAI A-State during the 
Experiment 

Changes in level of A-State were evalu- 
ated in a three-factor analysis of variance 
design in which A-Trait, experimental con- 
ditions, and periods were the independent 
variables, with repeated measures on the
last factor.

The statistical analyses of the A-State
data were based on the short form of the
STAI A-State scale. For these analyses, the 
five items in the short form were extracted 
from the A-State scales given before and 
after the experimental task. For the pretask 
and posttask A-State measures the correla- 
tions between the five-item scales and the 
remaining 15 items were .75 and .84, respec- 
tively. The alpha reliabilities for the three 
five-item A-State scales given during the 
learning task were .86, .88, and .89, respec- 
tively; the alpha reliabilities for the 20- 
item A-State scales given before and after 
the learning task were .89 and .94. 
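
For readers unfamiliar with the alpha reliabilities reported above, the following Python sketch computes coefficient alpha from a matrix of item scores. It assumes the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the toy data are hypothetical, and nothing here reproduces the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of per-subject lists of item scores."""
    k = len(responses[0])                                  # number of items
    item_vars = [variance([r[j] for r in responses]) for j in range(k)]
    total_var = variance([sum(r) for r in responses])      # variance of scale totals
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item scale answered by three subjects.
example_alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha approaches 1.0 as the items covary more strongly, which is the sense in which values such as .86 to .94 indicate internally consistent scales.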

The means and standard deviations of
the STAI A-State scores of the high A-Trait
and low A-Trait subjects in the stress and 
nonstress conditions are reported in Table 1 
for the pretask period, the three sections of 
the learning task, and the posttask period. 
The most important finding in the analysis 
of variance for these data was the signifi- 
cant A-Trait by conditions by periods triple 
interaction (F = 3.4, df = 4/20, p < .05) 
which indicated that the A-State scores of 
high A-Trait and low A-Trait subjects were 
differentially influenced by the stress and 
nonstress conditions. In addition, the main 
effects of A-Trait (F = 8.1, df = 1/60, p < .01)
and periods were significant (F = 38.1,
df = 1/60, p < .01). To clarify the triple 
interaction, two subsidiary analyses were 
run: the first evaluated the subjects’ initial 
reactions to the CAI learning task; the sec- 
ond examined change in A-State during the 
task. 


Initial Reactions to the CAI
Learning Task


In order to ascertain initial reactions to 
the learning task, changes in A-State from 
the pretask period to Task A were examined 
as a function of A-Trait and experimental 
conditions. The mean STAI A-State scores 
obtained in the pretask period and in Task 
A are presented in Figure 1. As in the overall 
analysis of variance, a significant triple 
interaction involving A-Trait, experimental 
conditions, and periods was obtained for 
these data (F = 7.6, df = 1/60, p < .01) 
indicating that the initial reactions of high 
A-Trait and low A-Trait subjects were
differentially influenced by the stress and
nonstress experimental conditions. In addition,
the main effects of both A-Trait (F = 8.9,
df = 1/60, p < .01) and periods (F = 122.3,
df = 1/60, p < .01), and the A-Trait by
periods interaction (F = 58, df = 1/60,
p < .01) were significant.

In order to determine how stress differentially
influenced A-State for subjects who
differed in A-Trait, separate two-factor
analyses of variance were computed for the
high A-Trait and low A-Trait subjects. For
the high A-Trait subjects, a significant
conditions by periods interaction (F = 9.1, df =
1/30, p < .05) and a periods main effect
(F = [illegible], df = 1/30, p < .01) were
obtained. The conditions by periods interaction
indicated that high A-Trait subjects
showed a greater increase in A-State in the
stress condition than in the nonstress
condition, as may be noted in Figure 1.

TABLE 1

MEAN A-STATE SCORES FOR HIGH A-TRAIT AND LOW A-TRAIT SUBJECTS IN THE
STRESS AND NONSTRESS CONDITIONS

                               Pretask               Learning task             Posttask
Group                     Long form  Short form   Task A  Task B  Task C  Short form  Long form
All
    M                       35.4        9.0        13.1    11.6    11.5     10.8        41.2
    SD                       9.7        3.5         3.4     3.7     3.9      3.5        12.2
High A-Trait Stress
    M                       37.2        9.2        15.9    13.2    12.4     12.3        48.1
    SD                      10.8        3.9         3.2     4.2     3.9      3.5        12.8
High A-Trait Nonstress
    M                       38.8       10.1        13.8    12.8    12.4     11.4        43.1
    SD                       9.2        3.9         2.9     3.1     3.2      2.9        10.4
Low A-Trait Stress
    M                       34.8        8.7        11.6    11.5    11.6     10.4        40.4
    SD                      10.6        3.5         2.4     [?]     3.6      3.7        13.3
Low A-Trait Nonstress
    M                       30.7        8.0        11.5     9.4     9.8      8.9        33.2
    SD                       6.6        2.7         3.3     3.3     4.4      4.4         7.4

FIG. 1. Mean A-State scores for the initial reactions of high A-Trait and low A-Trait
subjects.

For the low A-Trait subjects, only the
periods main effect was significant (F =
46.5, df = 1/30, p < .01). In the absence of
a significant conditions by periods interaction,
this finding indicated that the increase
in A-State scores of the low A-Trait subjects
was about the same in the stress and
nonstress conditions. Thus, for high A-Trait
subjects, stress produced a greater increase
in A-State than did the nonstress condition,
whereas the increase in A-State for low
A-Trait subjects was comparable in both
experimental conditions.


Changes in A-State during the 
Experimental Task 


To evaluate changes in A-State during 
learning, the STAI A-State scores obtained 
in Sections A, B, and C of the CAI task were 
examined. Figure 2 presents the mean A- 
State scores for high A-Trait and low A- 
Trait subjects in the stress and nonstress 
conditions. The overall three-factor analysis
of variance for these data revealed a significant
triple interaction, which involved A-Trait,
experimental conditions, and periods
(F = 8.2, df = 2/120, p < .01). In addition,
the main effects of A-Trait (F = 8.9, df =
1/60, p < .01) and periods (F = 17.6, df =
2/120, p < .01) were significant. These findings
indicated that during the learning task
the stress and nonstress conditions had
differential effects on level of A-State for high
A-Trait and low A-Trait subjects.

To clarify this triple interaction, separate
two-factor analyses of variance were computed
for the high A-Trait and the low
A-Trait subjects. For the high A-Trait subjects,
the conditions by periods interaction
was significant (F = 3.7, df = 2/60, p <
.05) as was the periods main effect (F =
12.2, df = 2/60, p < .01). The interaction
may be attributed to two sources. First,
level of A-State was higher in Task A for
the high A-Trait subjects in the stress
condition than for the high A-Trait nonstress
subjects, apparently due to the initial
impact of negative feedback on level of
A-State (see Figure 1). Second, high A-Trait
subjects in the stress condition showed a
greater decline in A-State during the learning
task so that, by Task C, the level of
A-State of these subjects was about the
same as that of the high A-Trait subjects in
the nonstress condition (see Figure 2).
For low A-Trait subjects, the conditions
by periods interaction (F = 5.9, df = 2/60,
p < .01) and the main effect of periods (F
= 5.9, df = 2/60, p < .01) were significant.
These findings may be attributed to two
sources. First, the low A-Trait subjects in
both experimental conditions showed
comparable initial reactions to the task (Figure
1). Second, while level of A-State for low
A-Trait subjects in the stress condition
remained relatively stable throughout the
learning task, the A-State scores for low
A-Trait subjects in the nonstress condition
decreased, as may be noted in Figure 2.
Thus, level of A-State for the low A-Trait
subjects in the stress condition was
maintained at initial levels, presumably due to
the impact of negative feedback, whereas
the decline in A-State for low A-Trait subjects
in the nonstress condition seemed to
reflect adaptation to a difficult task.

FIG. 2. Mean A-State scores obtained during the learning task for high A-Trait and low
A-Trait subjects.

In summary, in the stress condition, the
high A-Trait subjects demonstrated a
greater initial increase in A-State from
pretask levels than did low A-Trait subjects.
During the learning task, high A-Trait subjects
in the stress condition showed a
marked decline in A-State whereas level of
A-State remained relatively constant for
low A-Trait subjects. In the nonstress
condition, the changes in A-State for high
A-Trait and low A-Trait subjects were quite
similar. Both groups showed almost the
same increase in A-State from pretask
levels and approximately parallel changes
in the level of A-State during the learning
task.


Effects of Stress, A-Trait, and A-State 
on Errors 


The errors made by high A-Trait and 
low A-Trait subjects in the stress and non- 
stress conditions on Tasks A, B, and C 
were evaluated by a three-factor analysis of 
variance in which A-Trait, conditions, and 
periods were the independent variables. The 
only significant finding was the main effect 
of periods (F = 21.6, df = 2/120, p < .01), 
which indicated that the number of errors 
for all groups decreased from Task A to 
Task C. The absence of any statistically 
significant effects involving A-Trait and ex- 
perimental conditions indicated that per- 
formance on the learning task was not in- 
fluenced by these variables. 

In a previous study, O’Neil, Spielberger, 
and Hansen (1969) found no relation be- 
tween A-Trait and errors, but there was an 
interactive relation between A-State and er- 
rors which was consistent with predictions 
from Drive Theory. Therefore, the relation 
between errors and A-State was evaluated 
in the present study. Since level of A-State 
decreased during the experimental session 
(see Figure 2), performance was evaluated 
as a function of the A-State scores that 
were actually obtained on each task. Subjects
with A-State scores above the median
on a task were designated as the high
A-State group and those with A-State scores
below the median were designated as the low 
A-State group; subjects whose A-State 
scores fell at the median were assigned to the 
smaller group. The median A-State scores 
obtained on Tasks A, B, and C were 12, 12, 
and 11, respectively. 
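
The grouping rule just described (a median split in which subjects scoring exactly at the median join the smaller group) can be sketched in Python. This is an illustrative reading of the rule, not the author's program: the function name `median_split`, the subject labels, and the choice to place tied subjects in the high group when both groups are the same size are assumptions.

```python
from statistics import median


def median_split(scores):
    """Split subjects into (low, high) A-State groups for one task.

    `scores` maps a subject id to the A-State score obtained on that task.
    Subjects exactly at the median are assigned to the smaller group.
    """
    med = median(scores.values())
    low = [s for s, v in scores.items() if v < med]
    high = [s for s, v in scores.items() if v > med]
    tied = [s for s, v in scores.items() if v == med]
    if len(low) < len(high):
        low.extend(tied)
    else:
        # Assumption: when the groups are equal in size, ties go to "high".
        high.extend(tied)
    return low, high


# Hypothetical scores for five subjects on one task.
low_group, high_group = median_split({"a": 10, "b": 12, "c": 12, "d": 15, "e": 16})
```

In this hypothetical example the median is 12, and the two subjects scoring 12 are placed in the low group because it would otherwise hold only one subject.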

The mean number of errors per problem 
made by the high and low A-State groups 
were 3.5 and 2.9 on Task A, 2.3 and 1.1 on 
Task B, and 2.2 and .8 on Task C. The data 
for each task were analyzed by separate 
one-way analyses of variance in which level 
of A-State (high vs. low) was the independ- 
ent variable. A significant main effect of 
anxiety was found for Tasks B (F = 4.4, df
= 1/64) and C (F = 11.2, df = 1/64),
indicating that the high A-State group
made significantly more errors than the low
A-State group on these tasks.
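
The one-way analyses of variance described above compare the mean errors of two groups on a single task. A minimal Python sketch of the F ratio follows; it uses the textbook between-groups and within-groups mean squares, and the function name `one_way_anova` and the toy error counts are hypothetical rather than taken from the study.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical error counts for a low and a high A-State group.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4]])
```

With only two groups, this F ratio equals the square of the independent-samples t statistic for the same comparison.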


DISCUSSION


In the present study, negative feedback
about performance in the stress condition
led to greater initial increments in A-State
for high A-Trait subjects than for low
A-Trait subjects. In contrast, the high A-Trait
and low A-Trait subjects in the nonstress
condition who were given nonevaluative
information about their performance showed
parallel changes in A-State. Thus, only
when stress was introduced through negative
evaluations of performance did the
high A-Trait subjects show greater initial
increments in A-State than the low A-Trait
subjects. On the assumption that negative
feedback was interpreted as more threatening
by the high A-Trait subjects, these results
are consistent with Trait-State Anxiety
Theory (Spielberger, 1971), which predicts
that high A-Trait subjects respond
with greater increments in A-State than low
A-Trait subjects in situations that pose
threats to self-esteem.

During the task, the pattern of A-State 
changes differed for the high A-Trait and 
low A-Trait subjects in the stress condition. 
Despite the repeated negative feedback, 
level of A-State for high A-Trait subjects 
decreased from Task A to Task C, while
level of A-State for low A-Trait subjects 
was sustained at high initial levels. This 
unexpected finding suggests that high A- 
Trait subjects may adapt better to threats 
to self-esteem than low A-Trait subjects. 
Since high A-Trait college students experience
anxiety states more frequently than do
low A-Trait students, it may be hypothesized
that they are more likely to develop
effective mechanisms for coping with such 
states. Such coping mechanisms could either 
help to reduce A-State directly or to change 
the subject’s perception of the stress pro- 
duced by the situation or both. 

The findings for the nonstress condition 
in the present study were comparable to the 
results reported by O'Neil, Hansen, and 
Spielberger (1969). In both studies, the 
high A-Trait and low A-Trait subjects reacted
to the task with initial increases in
A-State and showed similar decreases in
A-State during the task. Furthermore, level
of A-State for high A-Trait subjects was
higher than for low A-Trait subjects
throughout the task in both studies, and
increments in A-State were not related to
A-Trait. Although the difficult materials
presented by the computer led to increases in
A-State, the CAI task was apparently no
more threatening to high anxiety-prone
subjects than to the subjects who are not
easily threatened. Thus, in the present
study, as in the previous one, when no
explicit threat to self-esteem was introduced
there was no differential change in A-State
for subjects who differed in A-Trait.
Although the results for the A-State data 
of the nonstress condition of the present 
study were similar to those obtained by 
O’Neil, Hansen, and Spielberger (1969), 
very different results were found with
regard to the relationship between A-State
and errors in the two studies. In the present 
study, the high A-State subjects made sig- 
nificantly more errors than low A-State 
subjects on Tasks B and C, but not on Task 
A. O'Neil, Hansen, and Spielberger (1969) 
found that high A-State subjects made sig- 
nificantly more errors than low A-State 
subjects on Task A, but not on Tasks B and 
C. Furthermore, the findings of the previous 
study were consistent with the prediction
from Drive Theory that high A-State subjects
would make more errors than low
A-State subjects on more difficult learning
materials (many competing responses), and
fewer errors on easier materials (fewer
competing responses), whereas the results of
the present study failed to support Drive
Theory. As the task became progressively
easier, the high A-State subjects made
relatively more errors than the low A-State
subjects.
One possible explanation of the inconsistent
relation between A-State and errors in
the two studies is that O'Neil, Hansen, and
Spielberger (1969) used male subjects while
female subjects were used in the present
study. A considerable body of extant literature
suggests that the relationship between
anxiety and learning is different for men and
women. Chapeau (1968), for example,
found results consistent with Drive Theory
for females but not males. Katahn, in three
studies of anxiety and learning, reported
that: (a) females gave results more consistent
with Drive Theory than males (Katahn
& Dean, 1964); (b) males gave more
consistent results than females (Katahn &
Branham, 1968); and (c) there were no
differences in anxiety and learning for men
and women (Katahn & Lyda, 1966). It
may be noted that sex and anxiety interactions
probably reflect specific situational
variables which influence learning and
which may have differential significance for
men and women. Consequently, caution is
suggested in making generalizations
concerning the relationship between anxiety,
sex, and learning.

In summary, the results of the present 
study support predictions from Trait-State 
Anxiety Theory that negative feedback 
constitutes a threat to self-esteem and thus 
is perceived as more threatening by high 
A-Trait than by low A-Trait subjects. How- 
ever, the results failed to support predic- 
tions from Drive Theory in that high Drive 
(high A-State) subjects made significantly 
more errors than low Drive subjects on the 
CAI task for which the overall error rate 
was low, but did not differ on a task for 
which the error rate was high. 
THE FIRST LETTER MNEMONIC¹


DOUGLAS L. NELSON² AND CYNTHIA STARK ARCHER


University of South Florida 


The purpose of this study was to ascertain the efficiency of the first 
letter mnemonic as a memorization strategy. Single study and test 
trials on each of six lists were given by the method of serial recall. 
The first letters of the words of each list formed a vertical word, the 
mnemonic code word. Half of the subjects were informed about this 
list structure and were given the code word prior to each list presenta- 
tion, and the remaining half served as uninformed controls. Further- 
more, half of the subjects in each of these conditions were given the 
first letters as retrieval cues on the recall test and, for the remaining 
subjects, no cues were provided. The mnemonic significantly enhanced 
recall. The results of ancillary analyses suggested that this facilita- 
tion occurred because the mnemonic device preserved the relative 
positions of the words, not because first letters increased item
availability.


Although mnemonic techniques designed 
to enhance the efficiency of memory have 
been known since ancient Greek times 
(Yates, 1966), only recently have they been 
scrutinized under controlled laboratory con- 
ditions. The bulk of this research has dealt 
with the effectiveness and the apparent im- 
portant components of the numeric-pegword 
system, which involves utilization of a 
number-word rhyme like one-bun, two- 
shoe, three-tree, etc. in memorizing a series 
of items. Recent findings have indicated 
that it appears necessary to construct 
images depicting unitizing interactions be- 
tween these items and the pegwords if effi- 
ciency of recall is to increase (e.g., Bower, 
1970). For instance, if the initial item was 
“scissors,” a large pair of scissors cutting 
through a hamburger bun might be imag- 
ined. Relative to other memorization strate- 
gies (e.g., verbal mediation and rote repeti- 
tion) this technique has been reported to 
produce up to seven-fold increases in recall
(e.g., Bower, 1970; Wood, 1967). 

Research on the potential efficiency of 
another frequently used mnemonic device
has been nonexistent. This technique can be 
referred to as the “first letter mnemonic.” 
Examples depicting this method are com- 
mon. The name “ROY G. BIV” is used to 
recall the colors of the visible spectrum. 
Students of physiological psychology recall 
the cranial nerves by remembering the non- 
sense sentence “On old olympus towering 
tops a fop and glutton vended some hops.” 
Electronic technicians recognize the numer- 
ical values of resistors by their colors by 
recalling “Beyond Brown’s rose orchard you 
glimpse blue violets growing wild.” Further- 
more, given a list of characteristics or prop- 
erties to be memorized, students often re- 
port arranging the items so that the first 
letters form a nonsense word. For example, 
the major categories of psychoneurotic reaction
as given by the American Psychiatric
Association and Statistical Manual
are Anxiety, Dissociative, Conversion, Pho- 
bic, Obsessive, Depressive, and Psychoneu- 
rotic. In rearranging the order of these reac- 
tions the acronym “CADPOD” can be 
found which can be utilized as a simple 
mnemonic for facilitating the memorization 
of the classification system.

In each of these examples the mnemonic
includes the first letters of all items that
must be remembered arranged so that, if
necessary, proper sequencing is preserved as
well. Thus, if the system does facilitate re- 
call, it may do so by providing the initial 
letters of the to-be-remembered stimuli at 
the time of recall. Evidence attesting to the 
utility of first letters as retrieval cues can 
be derived from studies showing that they 
represent reliable redintegrative cues for re- 
generating list words (e.g., Earhard, 1967; 
Horowitz, White, & Atwood, 1968), from 
studies indicating that initial relative to 
other letters play a differentially significant 
role in word processing (e.g., Nelson, Fos- 
selman, & Peebles, 1971; Nelson & Gar- 
land, 1969), and from the common observa- 
tion that attempts at abbreviation always 
incorporate first letters. However, any facil- 
itation produced by the mnemonic may also 
result because the system preserves order in 
sequence information. In other words, the
system prescribes which item falls in the
first, second, etc. position. If the task requires
knowledge of relative position, recall
difficulty will be increased (e.g., Nelson,
1969), and, since the mnemonic could pro- 
vide this knowledge, it may improve recall. 
The major purpose of this experiment 
was to determine if the first letter mne- 
monic does facilitate recall and, second, 
whether this facilitation results because 
first letters enhance item availability, because
the device preserves relative position
cues, or both. Subjects studied and then recalled
lists of words in the order in which
they were presented. The first letters of each
of these lists formed a word. Half of the
subjects were informed about the structure
of the lists, and were given the mnemonic
code prior to the presentation of each list;
the remaining subjects were simply given
instructions for serial recall. These two
groups were subdivided so that the first letters
were or were not present at recall. The
purpose of manipulating first letter availability
at recall was to ascertain if these cues
would elevate the recall of the uninstructed
controls to the level of those utilizing the
mnemonic.


METHOD


Material


The first letters of each word within a given list
were all different and formed a six-letter word,
the mnemonic code word. For example, in one
of the lists the first letters formed the word 
FLOWER: FIGMENT, LOYALTY, OPINION, 
WARMTH, EFFORT, RATING. Six lists were 
constructed in this manner with all words being 
selected from the Paivio, Yuille, and Madigan 
(1968) norms. Mean imagery, concreteness,
and meaningfulness values were 3.43,
2.21, and 5.31, respectively, for words within the
lists; for the mnemonic code words these respec- 
tive values were 6.06, 6.62, and 6.81. Low imagery- 
concreteness items were used as list words on the 
assumption that they would make the task diffi- 
cult enough to ensure recall errors on these rela- 
tively short lists. Code words were high with 
respect to these characteristics assuming that they 
might be more easily differentiated from their 
respective list words. 
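
The list structure described above can be illustrated with a minimal Python sketch. It uses the FLOWER list quoted in the text; the function names `code_word` and `is_valid_list` are hypothetical helpers introduced only for illustration, and the check covers just the two constraints the text states (all first letters differ, and together they spell the code word).

```python
def code_word(words):
    """Return the word spelled by the first letters of `words`, in list order."""
    return "".join(w[0] for w in words).upper()


def is_valid_list(words, expected_code):
    """Check the two stated constraints: distinct first letters that spell the code word."""
    initials = [w[0].upper() for w in words]
    all_different = len(set(initials)) == len(initials)
    return all_different and code_word(words) == expected_code.upper()


# Example list taken from the Material section.
flower_list = ["FIGMENT", "LOYALTY", "OPINION", "WARMTH", "EFFORT", "RATING"]

print(code_word(flower_list))
print(is_valid_list(flower_list, "FLOWER"))
```

The sketch says nothing about the imagery, concreteness, or meaningfulness norms used to select the words; those would require the Paivio, Yuille, and Madigan (1968) values.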


Procedure 


All of the subjects were given instructions for 
serial recall, indicating that the words had to be 
reproduced in the order in which they were pre- 
sented. Each word was typed in upper case letters 
and was automatically presented via a Lafayette 
memory drum Model 303 at a 2-second rate. Following
the sixth word in the list, a line of asterisks
appeared at which time the subject wrote as many
of the words of the list as he could remember on
a 3 × 5 inch sheet of paper, covering each word
with a card as he wrote. There were six un- 
numbered, vertically placed lines on each sheet. 
When the subject finished writing, the experi- 
menter removed the recall protocol, replacing it 
with a new one, and the next list was shown. Thus, 
only a single study and recall trial were given on 
each list. 

As illustrated in Table 1, the principal variables
of the experiment constituted a 2 × 2 between-subject
factorial design. In addition to instructions
for serial recall, subjects in the mnemonic-present 
condition were told that the first letters of the 
words of each list consisted of a vertical six-letter 
word and that remembering it would facilitate 
recall. Prior to the presentation of a given list, 
the experimenter told the subject its mnemonic 
code word. Furthermore, these subjects were in- 
formed by the general instructions that, as an addi- 
tional memory aid, the first letters would also be 
provided in their appropriate positions on the 
recall sheet. Thus, the appropriate first letter 
appeared on each blank on the recall sheet. Sub- 
jects in the mnemonic-absent condition were in- 
structed about and were provided with the 
mnemonic code words, but the first letters were 
absent on the recall sheet. In the control-present 
condition, the subjects were given the instructions 
for serial recall and were informed that the first 
letter of each list word would be provided in its 
correct position as a recall aid. Obviously, the
subjects in this condition might have discovered 
that the first letters formed a vertically placed 
word but, according to postexperimental interviews,
none apparently did. Finally, in the control-absent
condition, subjects were given instructions
for serial recall with no reference to the structure
of the lists and without the provision of first letters
on the recall sheet.

Each subject received six lists under the con- 
dition to which he was assigned, the particular list 
sequence being determined by 6 × 6 Latin squares.
Four different, randomized 6 × 6 squares were
constructed, one for each of the four principal con- 
ditions of the experiment. Thus, conditions repre- 
sented a between-square variable, with ordinal 
position (columns) and lists (Latin letters) as 
within-square variables. 
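
The counterbalancing scheme can be sketched in Python as follows. This is only an illustration under stated assumptions: the article does not say how its squares were randomized, so the sketch shuffles the rows, columns, and symbols of a cyclic square, which yields a valid Latin square but does not sample uniformly from all possible squares. The function names and the seed are hypothetical; the count of three subjects per row is the figure given in the Subjects section.

```python
import random


def random_latin_square(n, rng):
    """Cyclic n x n Latin square with rows, columns, and symbols shuffled."""
    rows, cols, symbols = list(range(n)), list(range(n)), list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(symbols)
    return [[symbols[(r + c) % n] for c in cols] for r in rows]


def is_latin_square(square):
    """Every symbol appears exactly once in each row and each column."""
    full = set(range(len(square)))
    rows_ok = all(set(row) == full for row in square)
    cols_ok = all(set(col) == full for col in zip(*square))
    return rows_ok and cols_ok


rng = random.Random(0)  # arbitrary seed, for reproducibility only
conditions = ["mnemonic-present", "mnemonic-absent",
              "control-present", "control-absent"]

# One square per condition: rows are list sequences, columns are ordinal
# positions, and cell values are list numbers 0-5.
squares = {c: random_latin_square(6, rng) for c in conditions}

subjects_per_row = 3
total_subjects = len(conditions) * 6 * subjects_per_row
print(total_subjects)
```

Each row of a square gives the order in which one group of subjects receives the six lists, so every list occupies every ordinal position equally often within a condition.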


Subjects 


Three different subjects were assigned to each 
of the six rows within each square, making a total 
of 18 subjects in each principal condition and a 
total of 72 subjects in the experiment. The sub- 
jects, who received course credit in introductory 
psychology for their participation, were assigned 
to conditions in blocks of 24, with one subject 
from each condition and row per block. Assignment 
within blocks was determined by a table of random
numbers. A single female experimenter collected
all data.


RESULTS 


Since the experimental design was a vari- 
ation of Plan 8 as described by Winer 
(1962, p. 549), the partition of the compo- 
nents of variance in the initial analysis was 
in accordance with his suggested break- 
down. Thus, statistical analysis of errors on 
the six lists (maximum of 36) indicated 
that the effect of instructions was not reliable, but that
first letter availability (F = 18.57, df =
1/68) and the Instructions × First Letter
Availability interaction (F = 7.37, df = 
1/68) were significant at p < .01. Table 1 
presents the means for this interaction. A 
Fisher’s Least Significant Difference of 1.09 
indicated all the means were reliably different
from each other.


TABLE 1

MEAN ERRORS AS A FUNCTION OF INSTRUCTIONS
AND FIRST LETTER AVAILABILITY AT RECALL

Availability of          Instructions
first letters        Mnemonic    Control
Present                11.50       9.17
Absent                 13.00      15.78


TABLE 2

MEAN ERRORS AS A FUNCTION OF INSTRUCTIONS
AND SERIAL POSITION

                         Serial position
Condition       1      2      3      4      5      6
Mnemonic       .92   2.19   2.22   2.53   2.23   2.1
Control        .28   1.25   2.33   2.75   2.97   2.89


Thus, fewest errors were obtained in the
control-present condition, when subjects were not instructed
about the nature of the mnemonic code
word but were provided with the first letters
of the list words on the recall test. Next
fewest errors occurred in the mnemonic conditions,
when subjects were instructed about
the code word, with slightly but significantly
fewer errors obtained when first letters
were present relative to absent. Greatest
difficulty was encountered in condition
control-absent in which no instructions regarding
the code word were provided and
first letters were unavailable on the recall
test. The only other significant source of
variance in this analysis was ordinal-position
(F = 5.10, df = 5/320, p < .01). Mean
errors for Positions I-VI were, respectively,
1.49, 2.25, 2.08, 2.15, 2.21, and 2.19. An LSD
of .36 indicated that fewest errors were obtained
on Position I relative to all other
positions, which did not differ from each
other. The main effect of lists and all available
interactions between the principal conditions
of the experiment and ordinal-position,
and between these sources and lists,
were not statistically reliable.

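The pairwise comparisons reported in this paragraph can be retraced with a short Python sketch. It only applies the reported LSD values (1.09 for the condition means, .36 for ordinal position) as thresholds on the reported means; it does not recompute the LSD itself, which would need the error mean squares that the article does not give. The function name `pairs_exceeding` is a hypothetical helper.

```python
from itertools import combinations


def pairs_exceeding(means, lsd):
    """Return the pairs of labels whose absolute mean difference exceeds the LSD."""
    return [(a, b) for a, b in combinations(means, 2)
            if abs(means[a] - means[b]) > lsd]


# Condition means from Table 1 and the reported LSD.
condition_means = {
    "mnemonic-present": 11.50,
    "control-present": 9.17,
    "mnemonic-absent": 13.00,
    "control-absent": 15.78,
}
reliable_conditions = pairs_exceeding(condition_means, 1.09)

# Ordinal-position means and the reported LSD.
ordinal_means = {"I": 1.49, "II": 2.25, "III": 2.08,
                 "IV": 2.15, "V": 2.21, "VI": 2.19}
reliable_ordinal = pairs_exceeding(ordinal_means, 0.36)

print(len(reliable_conditions), "of 6 condition pairs differ reliably")
print(reliable_ordinal)
```

Under these thresholds every pair of condition means differs, and the only reliable ordinal-position differences involve Position I, which matches the description in the text.
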
To determine if the mnemonic conditions
influenced the form of the serial position
error curve, a second analysis of variance
was performed. In this analysis it was necessary
to collapse across lists and ordinal
position. The pattern of significant sources
for the between-subjects variables, instructions
and first letter availability, was identical
to that reported for the previous analysis.
In addition, serial position (F = 12.88,
df = 5/340) and Instructions × Serial Position
(F = 6.30, df = 5/340) were significant
at p < .01. No other interactions with
serial position were reliable. The means for
the reliable interaction are shown in Table
2. Fisher's LSD was .56. In the mnemonic
conditions recall errors significantly increased
from Position 1 to 2 and did not
vary reliably thereafter. In the control con- 
ditions the more typical bowed curve was 
obtained. Thus, the principal effect of mne- 
monic instructions was to flatten the usual 
serial position curve beyond the initial 
item. It is also interesting to note that the 
facilitation produced by the mnemonic in- 
structions at the end of the list came at the 
cost of increased errors on Positions 1 and 2. 
The comparison of errors for the mne- 
monic and control conditions when first let- 
ters were absent indicated that the first let- 
ter mnemonic significantly enhanced recall. 
To determine if this effect should be attrib- 
uted to enhanced item availability or to 
preservation of order information the recall 
protocols were rescored using a lenient 
scoring criterion. In order to be scored as
correct an item had to be recalled, but not 
necessarily in its proper position as in the 
previous analyses. This procedure allowed 
an evaluation of item availability inde- 
pendent of the recall of appropriate posi- 
tion-in-sequence information. If the superi- 
ority of the mnemonic-absent relative to 
the control-absent conditions remained un- 
altered under this rescoring, this result 
would suggest that the facilitation was pro- 
duced by first letters enhancing item avail- 
ability. If the difference was eliminated by 
differential recall increases for the control 
condition, this finding would suggest that 
items were equally available in both conditions
and that the facilitation effect ascribed
to the mnemonic resulted because
this technique preserved order-in-sequence.
Respective mean errors for the mnemonic
and the control were 12.83 and 11.05. Although
items appeared to be slightly more
available within the control condition, an
analysis of variance indicated that this difference
was not reliable (F = 1.77, df =
1/34). Thus, at least under the present conditions,
the first letter mnemonic appeared
to facilitate recall because it provided relative
position cues. The attenuation of the
usual serial position effect obtained within
the mnemonic conditions may therefore
have resulted because position information
was inherent in the mnemonic code.

An analysis of the extra-list intrusions
also suggested that first letters did not provide
salient retrieval cues in this task. Significantly
more intrusions were obtained in
the mnemonic (4.89) than in the control
(1.11; F = 28.48, df = 1/84, p < .01).
Virtually all of the intrusions in the mne- 
monic condition shared initial letters with 
the appropriate item. Furthermore, the ma- 
jority of the intrusions in this condition 
were relatively abstract, for example, Men- 
tion for Method, Expect for Effort. Thus, 
although the first letters appeared to prime 
appropriate items they apparently were not 
useful for priming specific list items. 
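
The difference between the strict serial criterion and the lenient criterion used in the rescoring can be sketched in Python. The recall protocol below is hypothetical and built only for illustration (it borrows the intrusion "Expect for Effort" mentioned above); the function names are likewise assumptions, not the authors' scoring procedure in any detail beyond what the text states.

```python
def strict_errors(presented, recalled):
    """Serial criterion: a position is an error unless the correct
    word was written in that position."""
    return sum(1 for p, r in zip(presented, recalled) if p != r)


def lenient_errors(presented, recalled):
    """Lenient criterion: a word is an error only if it was not
    written anywhere on the recall sheet."""
    produced = {r for r in recalled if r}
    return sum(1 for p in presented if p not in produced)


presented = ["FIGMENT", "LOYALTY", "OPINION", "WARMTH", "EFFORT", "RATING"]

# Hypothetical protocol with six lines: two words transposed,
# one extra-list intrusion, and one blank.
recalled = ["FIGMENT", "OPINION", "LOYALTY", "WARMTH", "EXPECT", ""]

print(strict_errors(presented, recalled))
print(lenient_errors(presented, recalled))
```

In this example the transposed pair counts as two errors under the strict criterion and as none under the lenient one, which is the kind of gain that would be expected if items were available but their positions were not.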


DISCUSSION


The findings of this experiment suggest 
that the first letter mnemonic represents an 
efficient memorization strategy. Despite the 
rapidly paced procedures, despite the fact 
that mnemonic code words were supplied by 
the experimenter instead of being subject
generated, and despite the fact that
subjects in these conditions had an “extra” 
word to remember, performance in each 
mnemonic condition is superior to that for 
the control. The lack of reliable interactions 
with ordinal-position suggests that this fa- 
cilitation begins on the first list and is main- 
tained throughout the practice session. 

The slightly superior performance within 
mnemonic-present relative to mnemonic- 
absent appears to have resulted from sev- 
eral subjects in the latter condition forget- 
ting or not using the code word. For nine of 
the subjects in the mnemonic-absent treat- 
ment the vertically placed code word was 
written down for all six lists, even though 
there were blanks next to one or more of the 
letters. For the remaining nine subjects this 
word was available for an average of 3.6 
lists. This difference may only reflect differ- 
ences in strategy, since failing to record the 
code word does not necessarily mean that it 
was not utilized. However, subjects indicat- 
ing that the code word was present for each 
list made an average of 10.89 errors whereas 
the remaining subjects incurred an average 
of 15.11 errors. 

The lenient scoring, the serial position 
curve findings and the data on extra-list 
intrusions suggest that the facilitation effect 
produced by the mnemonic resulted because 
the code word provided the relative ordering
of the items within the list, not because
first letters acted as useful redintegrative
cues for regenerating list words and, hence,
increased item availability. However, it
should be recalled that the items within the
lists were not only abstract but semantically
heterogeneous as well. When items are
highly interrelated, as are the categories of
psychoneurotic reaction to the student of
abnormal psychology, then first letters may
serve as more useful retrieval cues. Essentially,
with high interrelatedness among the
words there are a greater number of con- 
straints operating on the relevant set of 
plausible alternatives through which the 
search for the correct item might be con- 
ducted. Presumably, the smaller the size of 
the relevant set the greater the likelihood of 
retrieving the correct item. This possibility 
remains to be tested. 

Contrary to expectation, providing initial
letters at recall for the uninstructed con- 
trols not only elevated their performance to 
the level of those provided with the mne- 
monic, but this procedure resulted in the 
highest level of recall performance. Appar- 
ently, the availability of the mnemonic 
code word before and during list presenta- 
tion did not compensate for the greater 
memory load produced by having an 
“extra” word to remember. As in the mne- 
monic conditions, availability of first letters 
at recall provided appropriate relative or- 
dering cues. However, subjects in this con- 
dition were not burdened by the necessity 
of processing the mnemonic code word dur- 
ing acquisition. Comparison of the serial 
position curves for the mnemonic and the 
control conditions indicates that the availa- 
bility of the code word during acquisition 
tended to inflate errors at the beginning of 
the list. Thus, the presence of the mnemonic
code word tended to reduce the primacy ef- 
fect typically obtained in serial recall. The 
reasons for this differential effect are not 
clear at this time, but one possibility is that 
the mnemonic code word was being re- 
hearsed along with the first items in the list 
as these initial items were being shown. Re- 
hearsal of the code word at that time would 
differentially interfere with the rehearsal of 
the items at the beginning of the list. This 
interpretative speculation assumes that the 
code word was not rehearsed throughout the 
list presentation, but only initially. 


The ability of kindergarten children to visually differentiate the
graphemic structure of words was examined using four different 
orthographies. Children were asked to assemble a series of words using 
graphemic units corresponding to the phonemic structure of the words. 
The words were presented in the traditional orthography, initial teach- 
ing alphabet (i.t.a.), and two additional orthographies designed to in- 
corporate graphic cues pertaining to the graphemic structure of the 
words. It was hypothesized that children would be more able to cor- 
rectly assemble words when presented in orthographies providing 
graphic cues than in the traditional orthography. This hypothesis was
supported and the implications concerning the effectiveness of the
i.t.a. versus other orthographies were discussed.


Downing (1967) and Downing and Jones 
(1967) have provided evidence to indicate 
that the traditional orthography is not as 
effective as the initial teaching alphabet 
(i.t.a.) in teaching children to read. Downing
(1969) suggested that the superiority of the
i.t.a. lies in its facilitation of the perception
of linguistic structure. Specifically, Downing
suggests that the i.t.a. provides increased
correspondence between the graphic units of
writing (graphemes) and the phonemic
units of language, and facilitates the visual
differentiation of these graphemes within
words. The traditional orthography, on the
other hand, consists of variable grapheme-phoneme
associations as exemplified by the
fact that approximately 40 phonemic sounds
are represented by only 26 letters.

That grapheme-phoneme correspondence
is an important factor in reading skills has
been shown in several studies. Gibson,
Pick, Osser, and Hammond (1962) and
Gibson, Osser, and Pick (1963) have found
that words in which there is an invariant
grapheme-phoneme correspondence are
more easily recognized tachistoscopically
than words in which there is no such cor- 
respondence. Several studies (Fraunfelker & 
Spear, 1969; Underwood & Schultz, 1960) 
have found that the pronunciability of 
words is an important factor in paired- 
associate learning. Bishop (1964) and Jeffrey 
and Samuels (1967) have shown that letter 
training, in which the subject learns to 
associate a graphic symbol with its spoken 
sound, is superior to the whole-word technique
in learning to read new words.
The idea that perceptual training facili- 
tates reading development has also been 
supported in several studies. Elkind and 
Deblinger (1969) found that children given 
perceptual training displayed significantly 
higher pretest-posttest increases on a number
of reading tasks than a group of children
not given any perceptual training. Peters 
(1966) found that for those children taught 
to read with the i.t.a., spelling errors in 
traditional orthography were not only 
reduced but were more systematic. Children 
taught to read with the traditional orthog- 
raphy made more unsystematic errors sug- 
gestive of faulty perceptual training. 
Although the evidence strongly suggests 
the importance of grapheme-phoneme cor-
respondence in learning to read, there are 
some findings to the contrary. Blanton and 
Nunnally (1967) and Gibson, Shurcliff, and 
Yonas (1970) have found that deaf children, 
presumably unaffected by pronunciation 
differences in words, also recognized words 
with high pronunciability more readily than 
those of low pronunciability. Gibson (1965) 
also cites several studies (Levin, Baum, & 
Bostwick, 1963; Levin & Watson, 1963) in 
which training with orthographies con- 
taining perfect grapheme-phoneme cor- 
respondence did not facilitate transfer to a 
more complex orthography in which the cor- 
respondence was more variable. Gibson 
(1970) still feels, however, that children search
for invariant units within words when 
learning to read and that instructional 
programs should be established that will 
facilitate perceptual search for these in- 
variant units. Downing (1969) is suggesting 
that the i.t.a. provides this kind of instruc- 
tion. 

The present study investigates whether
the i.t.a., as well as two other revised orthographies,
facilitates the perception of
linguistic structure. It may be argued that
the i.t.a. allows for a more efficient differentiation
between graphemic units in that
when letters are combined to form separate 
graphemes they are connected graphically, 
and that when single letters comprise a 
single grapheme they are not joined with 
any other letter. The traditional orthog- 
raphy supplies no such graphic cues to 
indicate which letters are combined into 
single graphemes (subsequently called 
“digraphs”) and which letters remain 
distinct. For example, in the word “that,” 
the letters t and h are combined into one 
digraph (th) which corresponds to a distinct 
phonemic sound. In the i.t.a., the t and the 
h are joined together (th), thus facilitating 
the perception of these two letters as being a 
digraph. In traditional orthography, there is 
nothing to indicate that these two letters 
form a digraph, this concept having to be 
learned in a rote fashion. Therefore, it is 
generally hypothesized that orthographies 
which provide such graphic cues will facili- 
tate the perception of digraphs within a 
word, and will facilitate the synthesis of
these digraphs and other letters into whole
words.


METHOD 


Subjects 


Twenty males and 20 females, ranging in age 
from 56 to 73 months, were selected as subjects in 
the experiment. The children were drawn from two 
kindergarten classes in the Victoria area. All of the 
females in the two classes were used as subjects 
but since there was a larger number of males than 
required, the boys were selected at random. The 
subjects were randomly assigned to one of the four 
treatment conditions so that there were 10 in each 
condition. Due to the randomization procedure, 
there was a resultant unequal number of boys and 
girls in each treatment condition. Because of this, 
special techniques for analyzing the data were 
employed as noted below. 


Stimulus Materials 


Two lists of four-letter words comprised the 
stimulus material. The words on each list were 
printed on separate pieces of posterboard using 
four different orthographies. The traditional 
orthography and the i.t.a. were used in two of the 
orthography conditions. Two other orthographies 
with graphic cues were also devised. An under- 
lining orthography placed a line beneath the 
digraph contained within each word, while a 
spacing orthography separated the graphemes 
within a word with double spaces. One list com- 
prised the words in the training task while the 
second list comprised the words in the transfer 
task. The words, printed in each type of orthog- 
raphy, are presented in Table 1. As can be seen, 
the words in the transfer task are made up of the 
same graphemic units as the words in the training 
task except that they are in different positions 
within the words. 
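
The relation between the two word lists can be checked with a small Python sketch. The digraph inventory (`th`, `sh`, `oi`, `ch`) is an assumption inferred from the eight stimulus words, since the article does not list the digraphs explicitly, and the helper names are hypothetical. The sketch splits each word into graphemic units, renders the spacing orthography with double spaces as described above, and confirms that the transfer words reuse units from the training words in different positions.

```python
# Assumed digraph inventory, inferred from the eight stimulus words.
DIGRAPHS = ("th", "sh", "oi", "ch")


def graphemes(word):
    """Split a word into graphemic units, treating each assumed digraph as one unit."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def spacing_form(word):
    """Spacing orthography: graphemes separated by double spaces."""
    return "  ".join(graphemes(word))


training = ["thin", "ship", "oily", "chop"]
transfer = ["lich", "posh", "loin", "pith"]

training_units = {u for w in training for u in graphemes(w)}
transfer_units = {u for w in transfer for u in graphemes(w)}

print(spacing_form("thin"))
print(transfer_units <= training_units)
```

Underlining and the joined i.t.a. characters are graphic features that plain text cannot show, so only the spacing condition is rendered here.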

For each of the words on both lists, the letters 
contained in them were printed on separate 9 ? 
inch pieces of posterboard. In addition, all possible 
sequential combinations of the letters in the words 
were also printed on identically sized pieces of 
posterboard. For example, for the word "ship,"
the letters s, h, i, and p as well as the combinations 
sh, shi, hi, hip, and ip were printed. In this way, 
the whole word could be put together in a number 
of different ways only one of which was correct 
according to the rules of grapheme-phoneme 
correspondence. 
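
The set of pieces described for "ship" can be enumerated with a brief Python sketch. The function names are hypothetical, and the sketch assumes that the whole word itself was not supplied as a piece, which is consistent with the nine pieces listed in the text. It also counts the ways the word could be assembled from those pieces.

```python
def letter_pieces(word):
    """All runs of adjacent letters that are shorter than the whole word."""
    n = len(word)
    return [word[i:j] for i in range(n) for j in range(i + 1, n + 1)
            if j - i < n]


def assemblies(word, pieces):
    """Every way to build `word` from left to right out of the given pieces."""
    if not word:
        return [[]]
    ways = []
    for k in range(1, len(word) + 1):
        head = word[:k]
        if head in pieces:
            ways.extend([head] + rest for rest in assemblies(word[k:], pieces))
    return ways


pieces = set(letter_pieces("ship"))
ways = assemblies("ship", pieces)

print(sorted(pieces))
print(len(ways))
print(["sh", "i", "p"] in ways)
```

Of the possible assemblies, only the one that keeps the digraph together and the remaining letters separate (sh, i, p) follows the rule the children were to learn.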


Covariates 


In order to control for initial differences in
prior reading experience, the subjects were administered
portions of the Clymer-Barrett
Prereading Battery individually. Each subject
was given the first 16 items of Subtest 1, the recognition
of letters task, and all 20 items of Subtest 2,
the matching of words task. Both of these subtests 
comprised the visual discrimination portion of the 
battery. This portion seemed appropriate for 
control purposes since the experimental task was 
essentially of a visual discrimination nature. Only
the first 16 items of Subtest 1 were used since they 
dealt with recognition of lower-case letters, the 
nature of the stimuli in the experimental task. 


Procedure 


Approximately 1 week after the administration 
of the prereading battery, each of the subjects was 
exposed to the experimental situation. The subject 
entered the experimental room and sat at a desk 
beside the experimenter. Laid out on the table was
an example word and all of its parts. The task was
introduced as being similar to putting together a
picture puzzle. The subject was told that he would
be shown a picture of a word and that his task was
to take some of the parts provided and put them 
together to make a picture of that word. At this 
point, the experimenter demonstrated the task 
by putting together the example word. Each 
subject was told that the word could have been 
constructed in another way but that there was 
only one correct way of doing it. The subject also 
was told that there was a rule which, if learned, 
would enable him to put together correctly all the 
words he would see. 

The subject then was asked to construct the 
first of the four training words. On the first trial, 
he attempted to put the word together without an 
explanation of the rule. If he was correct, the rule 
was demonstrated to him and it was pointed out 
that he had been correct because he had followed 
the rule. If the subject constructed the word 
incorrectly, he was shown the correct way again 
with a demonstration of the rule. The parts of the
word were then shuffled and laid out before the
subject who again tried to construct the word
correctly. He was told to keep in mind the rule he
had just been shown as it would show him how to
construct all the words. This procedure was continued
until the subject had correctly constructed
the words in the training task. The words were
presented in random order for each subject.

The explanation of the rule depended upon which
orthography condition the subject was in.
The rules were stated verbally as well as demonstrated
with the word the child was putting together at
the time. The subjects in the i.t.a. condition
were told that "only when two letters are joined
together do they go together" to make a single
grapheme in the word. Those in the spacing condition
were told that the different parts of
the word were separated by large spaces. In the underlining
condition, subjects were told that only those
letters that were underlined "went together" to
make up a separate part of the word. Finally, the
subjects in the traditional orthography condition
were told that two letters within each word always
"went together" and that they had to remember
which two letters these were.

TABLE 1

TRAINING AND TRANSFER WORDS PRINTED IN THE
INITIAL TEACHING ALPHABET, SPACING,
UNDERLINING, AND TRADITIONAL
ORTHOGRAPHY CONDITIONS

                       Orthography condition
Words        i.t.a.    Spacing    Underlining    Traditional

Training
  thin       thin      thin       thin           thin
  ship       ship      ship       ship           ship
  oily       oily      oily       oily           oily
  chop       chop      chop       chop           chop
Transfer
  lich       lich      lich       lich           lich
  posh       posh      posh       posh           posh
  loin       loin      loin       loin           loin
  pith       pith      pith       pith           pith


When the subject had correctly constructed the 
words in the training task, he was shown the trans- 
fer words in a random order. The child was given 
the words only once and was asked to put them 
together correctly. He was told that the same rule 
applied as in the previous words and that if he had 
learned this rule (or if he had learned the correct 
letter combinations in the traditional orthography 
condition), he would be able to put all the words 
together correctly. No feedback was given con- 
cerning whether any transfer construction was
correct or not.


Dependent Measures 


Two measures were obtained on the training 
words. The total number of trials taken to cor- 
rectly construct the four words was recorded along 
with the total number of errors over these trials. 
On the transfer words, three measures were ob- 
tained: (a) the total number of words constructed 
correctly; (b) the total number of errors made on 
the digraphs (Type A errors); and (c) the total
number of errors made on the remaining single 
letters in the word (Type B errors). This distinc- 
tion was made in order to test whether the type 
of orthography facilitated the differentiation 
between all parts of the word or whether it just 
facilitated the discrimination of the digraphs. 


Experimental Predictions 


It was hypothesized that (a) the number of trials 
taken to correctly construct the four training 
words, (b) the number of errors on the training 
words, and (c) the number of Type A and Type B 
errors on the transfer words would all be lower in 
those orthographies providing graphic cues (i.t.a., 
spacing, and underlining) than in the orthography
providing no such cues (traditional orthography).
It was also hypothesized that the number of cor- 
rect constructions on the transfer words would be 
higher for those orthographies providing graphic 
cues than for traditional orthography. 


RESULTS AND DISCUSSION


A test of the homogeneity of covariance 
was found to be significant; age, as a co- 
variate, was not distributed randomly 
between sexes. In view of the difficulties 
arising when treatment effects and covari- 
ates are not independent of each other 
(Evans & Anastasio, 1968), the three co- 
variates were dropped from further analyses. 
Further justification for this is provided by 
the finding that the covariates were not 
significantly related to any of the dependent 
measures and therefore were actually adding 
variance to the error term rather than re- 
ducing unaccounted for variance. 

In the present study, there were an un- 
equal number of males and females across 
orthography conditions. Due to the problems 
inherent in analysis of variance with an 
unequal n (Rosenblood & Lange, 1971), a 
conservative test of the Sex X Orthography 
interaction was performed and found to be 
nonsignificant. As a result, the main effects 
of orthography were subsequently analyzed 
with data from the male and female subjects 
combined. This provides for a more con- 
servative test of the effects of orthography 
but since there were 10 subjects per orthog- 
raphy in the present study, the problems 
associated with an unequal n solution are 
alleviated. 


TABLE 2
MEAN NUMBER OF LEARNING TRIALS, ACQUISITION
ERRORS, CORRECT TRANSFERS, AND
TYPE A AND TYPE B ERRORS IN THE
INITIAL TEACHING ALPHABET,
SPACING, UNDERLINING, AND
TRADITIONAL ORTHOGRAPHY
CONDITIONS

Measure | i.t.a. | Spacing | Underlining | Traditional orthography
Learning trials 6.2 | 6.7 7.
Acquisition errors | 4.4 | 5.3 5.2
Correct transfers | 3.8 3.0 7
Type A errors 6 E 3.2
Type B errors 5 1.4 5.1

Means for the five dependent measures in
each orthography condition are presented in 
Table 2. To test the experimental hy- 
potheses, an a priori multiple comparison 
between the three orthographies providing 
graphic cues, and traditional orthography 
was performed on the dependent measures. 
A multivariate F test, using Wilks' lambda
criterion (Morrison, 1968), was computed to 
determine whether this comparison was 
significant for the dependent measures 
taken as a set. This test was significant 
(F = 14.43, df = 1/5, p < .001) indicating 
that the set of measures was able to signifi- 
cantly discriminate between those orthog- 
raphies providing graphic cues and the 
orthography without cues (traditional or- 
thography). 

Examining the univariate comparisons, 
only three of the five dependent measures 
showed significant differences between the 
cued and noncued orthographies. Both of 
the measures on the training words, the 
number of trials taken to correctly con- 
struct the words, and the number of errors 
over these trials, were not significantly 
different between the orthographies with 
and without cues. These findings demon- 
strate that the provision of graphic cues did 
not facilitate grapheme differentiation 
during training. The children learned to 
correctly construct the words in approxi- 
mately the same amount of time regardless 
of the orthography used in training. The 
children also produced approximately the 
same number of errors on these words. 

One conclusion that could be drawn from 
these findings is that the introduction of 
graphic cues does not facilitate grapheme 
differentiation. This equality of perfor- 
mance on the training words, however, may 
also have been due to the training procedure 
itself in which there was a very short time 
period between the experimenter’s demonstration
of the correct construction of a
particular word and the subject’s next
attempt to do so. A correct construction in
this case may have been due to the subject
having learned, in a rote fashion, to match
the experimenter's performance.

The real test of whether the provision of
cues facilitates grapheme differentiation is
whether differences occur on transfer. The
tests performed on the three transfer measures
were all significant, each supporting
the experimental predictions. The mean
number of correct constructions on the 
transfer words was significantly greater for 
the combined orthographies with graphic 
cues than for traditional orthography (F = 
24.60, df = 1/36, p < .001), this difference 
accounting for 39.2% of the variance. In 
addition, the mean number of Type A and 
Type B errors on the transfer words were 
both significantly lower in the combined 
orthographies with graphic cues than in 
traditional orthography (F = 62.30, df = 
1/36, p < .001, and F = 19.30, df = 1/36, 
p < .001, respectively). The percentage of 
variance accounted for by these differences 
was 63.2 and 32.0, respectively. These re- 
sults indicated that the provision of graphic 
cues did facilitate the differentiation of 
graphemes within words. 

Pair-wise comparisons using Tukey’s 
HSD test (Kirk, 1968) revealed the 
following: the mean number of correct 
constructions on the transfer words in the 
i.t.a., spacing, and underlining orthographies 
were all significantly greater (p < .05) than 
in the traditional orthography condition; 
the mean number of Type A errors on the 
transfer words in the i.t.a., spacing, and 
underlining conditions were all significantly 
lower (p < .01) than in the traditional 
orthography condition; and the mean 
number of Type B errors were significantly 
lower (p < .01) in the i.t.a. and spacing 
conditions than in traditional orthography, 
but the difference between traditional 
orthography and the underlining condition 
was not significant.

The multiple comparison between the
two specially devised orthographies (spacing
and underlining) and the i.t.a. was also
tested and found to be nonsignificant for
all transfer measures. The difference between
the mean number of Type B errors in the
specially devised orthographies and the i.t.a.
approached significance, however (F = 3.24,
df = 1/36, p < .08), with the i.t.a. resulting
in fewer Type B errors.

In general, however, the three orthographies
providing graphic cues are very
similar not only in their relationship to
traditional orthography, but also to each
other. These orthographies may therefore
be posed as a possible solution to two 
problems involved in teaching children to 
read. It has been shown that letters contain 
distinctive features which allow children to 
differentiate between them (Gibson, Gibson, 
Pick, & Osser, 1962; Gibson, Osser, Schiff, & 
Smith, 1963). But groups of letters in ad- 
dition to single letters may represent 
phonemes. The present results suggest that 
the superiority of the i.t.a. in teaching
reading may be due, as Downing (1969) has 
suggested, to the provision of features or 
cues which aid children to differentiate the 
digraphs, as well as the single letters, in a 
word. The problem of grapheme differen- 
tiation also appears to be solved comparably 
well with the use of orthographies which 
retain the English alphabetic characters 
augmented by special cues. 

A second problem that may be alleviated 
by the spacing and underlining orthographies 
is the later transfer which children have to 
make to the traditional orthography after 
having learned to read with the i.t.a. 
Downing (1967) has shown that children 
sometimes experience difficulties making 
this transfer. This difficulty appears to be 
due to the children's inability to recognize
the similarity between words printed in the
i.t.a. and their counterparts in traditional
orthography. Since the present study
essentially found no differences between the
i.t.a. and either of the revised orthographies,
and since the latter two orthographies are
more comparable to traditional orthography
than to the i.t.a., it is suggested that
specially augmented orthographies retaining
the English characters be used in teaching 
children to read in order to eliminate this 
transfer problem. Research related to this 
problem is now in progress.


EFFECTS OF COMPARISON LEVEL FEEDBACK ON
CLASSROOM-RELATED VERBAL
LEARNING PERFORMANCE


C. R. SNYDER
Vanderbilt University


Subjects were placed in a classroom environment, and learned complex 
verbal material chosen because of its potential similarity to material 
in psychology courses. While learning the material, subjects received 
information regarding the performance of previous subjects on the 
same task. Subjects receiving low comparison level feedback (a 
standard of performance achieved at the seventeenth percentile by 
previous subjects) performed significantly better than those receiv- 
ing high comparison level feedback (a standard of performance 
achieved at the eighty-third percentile by previous subjects), who in 
turn performed significantly better than subjects receiving no comparison
level feedback. The effects of the comparison level feedback
upon the classroom-type performance may have been mediated by
differences in attention for the three comparison level groups. Implications
for the classroom learning process were discussed.


Many psychological and sociological 
theories have emphasized the importance of 
the individual comparing himself to others. 
For example, the comparison process is uti- 
lized in the following theories: Festinger’s 
(1954) social comparison, Helson’s (1964) 
adaptation level, Homans' (1961) distributive
justice, Lenski’s (1954) status crystallization,
Merton’s (1957) reference group,
Rotter's (1954) minimal goal level, Stouf- 
fer’s (Stouffer, Suchman, Devinney, Star & 
Williams, 1949) relative deprivation, and 
Thibaut and Kelley’s (1959) comparison 
level. Generally, all of these theories examine
the effect that comparison levels have
on the individual's self-concept (i.e., posi- 
tive and negative affective experience). The 
major purpose of the present study, how- 
ever, was to ascertain the effects of compar- 
ison level feedback upon performance on a 
classroom-related verbal learning task. The 
study sought to expand previous research 
by noting affective feelings, and more im- 
portantly, performance on a learning task 
as a function of receiving comparison level 
feedback. 

Previous research in our laboratory 
(Snyder & Katahn, 1970, 1972) examined 
the effects on performance of receiving com- 
parison level information regarding the per- 
formance of previous subjects on the same 
task. In the original study, Snyder and Ka- 
tahn (1970) investigated the effect on per- 
formance of feedback consisting of high, 
average, and low scores, each presented to 
different groups of subjects as the “mean” 
attained by subjects in a previous experi- 
ment. Thus, while all subjects thought that 
the feedback consisted of the mean performance
scores of previous subjects, it actually
consisted of the performance attained
at the eighty-third, the fiftieth, or
the seventeenth percentile by subjects in a 
previous experiment. Under these condi- 
tions, moving from the high, to average, to 
low comparison levels, there was a trend for 
performance on the classroom-related ver- 
bal learning task to improve. In a modified 
replication of the original study, Snyder 
and Katahn (1972) correctly informed sep- 
arate groups of subjects that they were re- 
ceiving a high (the eighty-third percentile of 
previous subjects), an average (the fiftieth 
percentile of previous subjects), or low (the 
seventeenth percentile of previous subjects) 
comparison level. Results for this modified 
replication showed a strong trend to improve 
(p < .001), moving from the high, to aver- 
age, to low comparison levels for perform- 
ance on the classroom-related verbal learn- 
ing task. Neither of the previous studies, 
however, contained a control group receiv- 
ing no comparison level information. By 
noting the performance of the high and low 
comparison groups with respect to the no 
comparison control group, it may be ascertained
whether these comparison levels enhance
or interfere with performance. Therefore,
the first and principal purpose of the
present study was to examine the effect on 
the classroom-related experimental per- 
formance of correctly informing subjects 
that they were receiving a low, high, or no 
comparison level. 
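
As a rough illustration of how percentile-based comparison levels of this kind could be derived, the sketch below reads the seventeenth, fiftieth, and eighty-third percentiles off a set of earlier scores. The scores and the function name `comparison_levels` are hypothetical, and the interpolation method is an assumption; the previous studies do not report raw scores or state how the percentiles were computed.

```python
from statistics import quantiles


def comparison_levels(previous_scores):
    """Return the low (17th), average (50th), and high (83rd) percentile scores.

    Assumption: percentiles are linearly interpolated ('inclusive' method);
    the original studies do not state how theirs were obtained.
    """
    cuts = quantiles(previous_scores, n=100, method="inclusive")  # 99 cut points
    return {"low": cuts[16], "average": cuts[49], "high": cuts[82]}


# Hypothetical scores standing in for an earlier experiment (not study data).
scores = list(range(0, 101))
levels = comparison_levels(scores)
print(levels)
```

In this sketch each group would be shown only one of the three values, labeled correctly as a low, average, or high standard.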

There are two reasons for including a 
measure of ongoing positive and negative 
affect in the present study. First, it can be 
noted whether ongoing affect relates to ex- 
perimental classroom-related performance. 
To date, two reported studies have exam- 
ined the relationship of self-reported affect 
at the time of the task to performance on 
that task. Katahn, Snyder, and Durlak 
(1970) measured self-reported affective 
changes during performance in serial learn- 
ing noting that the greater the degree of 
ongoing negative affect the poorer the per- 
formance. Furthermore, using the same 
complex verbal learning task employed in 
the current study, Snyder and Katahn 
(1970) reported that ongoing positive affective
experience correlated positively, and
ongoing negative affective experience correlated
negatively with performance. Second,
it can be noted whether the effects of the
comparison level feedback upon classroom- 
related learning are mediated by self-re- 
ported affects. Unlike the previous research 
mentioned above, which concentrated upon 
the affective consequences of the comparison 
process, the emphasis in the present study 
was to utilize ongoing affective experience in 
order to clarify the theoretical network re- 
lating the effects of the comparison level 
feedback upon classroom-related experi- 
mental performance. Overall, then, the sec- 
ond purpose of the present study was to 
examine the relationships of ongoing affect 
to (a) experimental classroom-related per- 
formance, and (b) the effects of comparison 
level feedback upon experimental perform- 
ance. 

The complex verbal learning task utilized 
in this study was chosen because of its po- 
tential similarity to the learning process in 
the typical classroom situation. In order to 
generalize from the experimental task to the 
classroom learning situation, performance 
on the experimental task should correlate 
highly with performance in actual class- 
room learning (Katahn & Branham, 1968). 
Performance on the complex verbal learning 
task used in the current study has corre- 
lated as strongly with total grade point av- 
erage as have verbal SAT scores (Branham 
& Katahn, 1969). Furthermore, the complex 
verbal learning task correlated significantly 
with the subjects’ semester grade point
average (r = .43, p < .01) and with the
subjects’ grades in introductory psychology
(r = .42, p < .01; Ray, Katahn, & Snyder,
1971). For all of these reasons, the experi- 
mental task used in the current study pro- 
vided a fairly good analogue for making 
inferences to the actual classroom learning 
situation. 


METHOD

Subjects

Two hundred and fifty-one male students
participated in the experiment as part of the
requirements for their introductory psychology
courses at Vanderbilt University.

Complex Verbal Learning Task


The learning task developed by Katahn and
Branham (1968) consisted of 16 items covering
Hullian learning theory, and 14 questions based on
the information items. For example, one information
item was:

    Conditioning is the process by which a stimulus
    comes to elicit a response it did not
    formerly elicit.

The related question is:

    ________ is the process by which a stimulus comes to
    elicit a response it did not formerly elicit.

Similarly, all questions leave a blank for the
appropriate answer on each trial. The information
and question items took up two lines of space on
slides and were presented visually on a screen at
the front of a classroom in the psychology department.


Experimental Procedure 


The 251 subjects were randomly assigned to either
a high (N = 83), low (N = 83), or no (N = 85)
comparison level received condition. The comparison
level information appeared at the top of
the answer sheet for each of the three trials. The
high comparison level received condition consisted
of the number of correct responses for the previous
highest 17% of the Vanderbilt subjects. The
low comparison level received condition consisted
of the number of correct responses for the previous
lowest 17% of the subjects. The no comparison level
received condition consisted of no information at
the top of the answer sheet for each trial. Groups
of from 6 to 10 subjects were administered the
task in a classroom. Each subject had a booklet
with an answer sheet for each of the three trials.
All subjects were required to complete a short
self-report scale at the beginning of each trial. The
self-report scale consisted of the words attentive,
shy, scornful, angry, joyful, ashamed, fearful, discouraged,
and irritated. The subject could vary
his response from 1 (little) to 9 (strong), denoting
the degree to which each word was descriptive of his
experience at that moment in the experiment.
These affects represent the words with a high factor
loading for each of the nine factors of the Differential
Emotion Scale (Izard, 1972) developed
on Vanderbilt subjects.
Three trials were given, each consisting of (a)
allowing the subject 60 seconds in which to complete
the self-report scale; (b) presenting the 16
information items, each visible for 8 seconds; and
(c) presenting the 14 fill-in-the-blank type questions,
each visible for 8 seconds, with instructions
to write the answer. The order of information
items was constant over the three trials, while the
questions were randomized so as to minimize serial
effects. Similarly, the order of the self-report
words was randomized over the three trials in order
to lessen the occurrence of a simple checking

RESULTS


Subjects’ grades (N = 251) in their in- 
troductory psychology course correlated 
positively and significantly (r = .22, p < 
.001) with total correct responses on the 
complex verbal learning task. 
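
The reported significance level can be checked, at least approximately, from r and N alone. The sketch below converts the correlation to a t statistic with N - 2 degrees of freedom and then uses a normal approximation for the two-tailed probability. The function names are illustrative, and the normal approximation is an assumption made to keep the example within the standard library; it runs slightly smaller than the exact t-based probability.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def normal_two_tailed_p(z):
    """Two-tailed p from the standard normal (large-sample stand-in for t)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


t = correlation_t(0.22, 251)       # r and N as reported above
p_approx = normal_two_tailed_p(t)  # approximate; exact t-based p is a bit larger
print(round(t, 2), p_approx < 0.001)
```

With 249 degrees of freedom the t distribution is close to normal, so the approximation is adequate for confirming that the correlation falls below the .001 level.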

The following analyses were performed in 
order to examine the effects on performance 
of receiving a no, high, or low comparison 
level. A 3 × 3 analysis of variance was
performed with the between-subject variable
of comparison level received (none,
high, or low), the within-subject variable of
trials (one, two, or three), and the dependent
variable of number of correct responses.
As shown in Figure 1, there was a significant
effect due to comparison level received
(F = 25.83, p < .001, ω² = .06). The mean
total correct responses for the low comparison
level subjects (X̄ = 19.42) was significantly
superior to the high (X̄ = 15.47) and
no comparison level performance (X̄ =
13.85; t = 4.92, p < .001; and t = 7.04, p
< .001, respectively), and the performance
of subjects in the high comparison level
tended to be superior to the no comparison
level performance (t = 1.88, p < .06). The
effect of trials reached significance (F =
911.63, p < .001, ω² = .50) as did the interaction
of comparison level received and
trials (F = 3.07, p < .02, ω² = .00). This
interaction stemmed from the relatively 
greater increase in performance over trials 
for the high as compared to the no and low 
comparison level received conditions. 

Table 1 shows that for all subjects 
grouped together, the "positive" self-re- 
ported affects correlated positively, and the 
"negative" affects correlated negatively
with the total number of correct responses 
on the task. Furthermore, the positive and 
negative affects were correlated with per- 
formance for subjects in each comparison 
level condition. Generally, however, the af- 
fect-performance correlations did not differ 
significantly over the no, high, and low 
comparison level conditions. 
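
One common way to compare correlations from independent groups is Fisher's r-to-z transformation. The source does not say which test was actually used, so the sketch below is only an assumed illustration, applied to the "ashamed" correlations for the high and low comparison level groups in Table 1. The function name is hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)           # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # standard error of z1 - z2
    return (z1 - z2) / se


# "Ashamed" correlations from Table 1: high (N = 83) vs. low (N = 83) group.
z = fisher_z_difference(-0.39, 83, -0.13, 83)
p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
print(round(z, 2), round(p, 3))
```

For this pair, z is roughly -1.78, short of the ±1.96 needed at the .05 level, which is in line with the statement that the affect-performance correlations did not differ significantly across conditions.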

Fig. 1. Correct responses on the classroom-type complex verbal learning task as a function
of comparison level (none, high, or low) and trials (one, two, or three).

The following series of analyses were performed
in order to ascertain whether the
effects of comparison levels upon experimental
performance were mediated by
ongoing self-reported affects. First, a series
of nine 3 × 3 analyses of variance were
performed in order to note differences in the
nine affects over the three trials of the experiment.
For each analysis of variance, the
between-subject variable was the comparison
level received (none, high, or low), the
within-subject variable was trials (one,
two, or three), and the dependent variable
was the score for the particular affect on
each of the three trials. Of all the affect-dependent
variables, the main effects of comparison
level received reached significance
only for attention (F = 3.48, p < .03). The
within-subject effects of trials reached significance
for all affect-dependent variables
(all ps < .001). For all of the affects as
dependent variables, the interaction of comparison
level and trials was not significant.
Because there was a significant difference in
comparison level for attention as a dependent
variable, subsequent t tests were performed
in order to compare the mean
self-reported attention for the no, high, and
low comparison levels. The t tests showed
that subjects in the low comparison level


TABLE 1
CORRELATIONS OF POSITIVE AND NEGATIVE AFFECTS WITH THE TOTAL NUMBER OF CORRECT RESPONSES
AS A FUNCTION OF SUBJECT GROUPING

Negative affects Positive affects
Subjects | Fear | Scorn | Irritated | Shy | Discouraged | Angry | Ashamed
Total (N = 251) | -.19* | -.23* | -.24* | -.08 | -.21* | -.21* | -.24*
No comparison level (N = 85) | -.25 | -.25 | -.19 | -.03 | -.26 | -.20 | -.23
High comparison level (N = 83) | -.30* | -.34* | -.30* | -.09 | -.26 | -.30* | -.39*
Low comparison level (N = 83) | -.01 | -.17 | -.13 | -.06 | -.13 | -.16 | -.13

*p < .01.

were significantly higher on ongoing attention
than subjects receiving no comparison
level (t = 2.46, p < .01), but were nonsignificantly
higher than subjects receiving the
high comparison level (t = .52, ns). Furthermore,
subjects in the high comparison
level received condition were significantly 
higher on attention than subjects receiving 
no comparison level (t = 1.98, p < .05). 


DISCUSSION


The total number of correct responses on 
the complex verbal learning task correlated 
positively (r = .22, p < .001) with the sub- 
jects’ grades in their introductory psychol- 
ogy courses. This finding is consistent with 
previous research in our laboratory (e.g., 
Branham & Katahn, 1969; Ray et al., 
1971) indicating that the complex verbal 
learning task provides a fairly good ana- 
logue for making inferences to the learning 
process in the classroom. 

Snyder and Katahn (1970) examined the 
effect on performance of comparison level 
feedback consisting of high, average, and 
low scores, each presented to different 
groups as the mean attained by previous 
subjects on the same task. In a modified 
replication (Snyder & Katahn, 1972),
groups of subjects were given the same
high, average, and low comparison level
feedback as employed in the initial study,
but the methodology differed in that subjects
were correctly informed as to which
comparison level they received. The results
of both studies showed that moving from
groups receiving high, to average, to low
comparison levels, there was a significant
effect for performance on the classroom-type
verbal learning task to improve. In the
present study, subjects were correctly informed
as to whether they were receiving a
low or high comparison level, and additionally,
one-third of the subjects were randomly
assigned to receive no comparison
level. Figure 1 shows significant differences
in experimental performance due to comparison
levels, with subjects in the low comparison
level condition exhibiting the best
performance, subjects in the high comparison
level condition exhibiting the second
best performance, and those in the no comparison
level condition doing the poorest.
Over trials, the rate of performance in- 
crease for high comparison level subjects 
was greater than for those subjects in the 
other two conditions. The detrimental effect 
on the performance of the high comparison 
level, therefore, occurs initially in the ex- 
periment and diminishes over the trials. 

The results of the present and previous 
studies indicate the consistency of achieving 
improved performance through the low 
comparison level. One possible explanation 
for this phenomenon may be that subjects 
are constantly exceeding the performance 
standard set by the low standard of com- 
parison (90% of the low comparison level 
subjects in the present study exceeded the 
norms on each of the three trials), and that 
these subjects are therefore receiving posi- 
tive reinforcement regarding their perform- 
ance. Such subjects may be more highly 
motivated to continue to learn the material 
in order to exceed the low standard, while 
most subjects receiving the high standard 
may not be exceeding the performance 
standard (90% of the high comparison level 
subjects in the present study fell below the 
norms on each of the three trials), and 
therefore do not continue to pay attention 
and study. The fact that both the low and
high comparison level groups were superior
to the no comparison level control group
indicates that comparison level feedback 
may be facilitative for classroom-type 
learning. The relatively poor performance 
for the no comparison level subjects may 
have occurred because, lacking a compari- 
son level to compete against, they may not 
have viewed the task as challenging, and 
therefore ceased both to pay attention and 
to continue to study. 

Because the task employed in the current 
study correlates fairly strongly with the 
subjects’ grades in introductory psychology, 
the comparison level results may have im- 
plications for the learning process in the 
classroom situation. A guiding principle of
education has traditionally been to encour- 
age students to aim for the highest possible 
achievement, that is, to make a high grade, 
and therefore the value of comparing one’s 
performance with the higher standards of
performance may be tacitly imbued in students.
As measured by the present and previous
studies in our laboratory, however, it
may prove most beneficial for learning 
when the student compares his performance 
with a lower standard. A probable reason 
for this phenomenon is that for most stu- 
dents, the surpassing of a low standard 
eventuates in a “success” or “reward” expe- 
rience, while falling below a high standard 
eventuates in a "failure" or “punishment” 
experience. In terms of the optimal “learn- 
ing set”, then, the lower level of comparison 
may generate higher motivation than the 
higher levels of comparison. Furthermore, 
the current study indicated that giving stu- 
dents no standard of performance against 
which they may compare themselves results 
in the poorest learning. One explanation for 
this phenomenon is that because students 
are conditioned to compare their performance
with high standards throughout their
educational careers, giving students no
comparison level may cause them to become 
disinterested in learning the information. 

Table 1 shows that generally for all sub- 
jects, the positive ongoing self-reported af- 
fects correlated positively, and the negative 
affects correlated negatively with perform- 
ance. This finding replicates those found in 
the 1970 and 1972 Snyder and Katahn studies
using the same task as was employed in
the current study. These results are logically
consistent because it would be expected
that subjects performing better would also 
experience more positive and less negative 
affect during the experiment. 

There is some evidence in the present 
study that the effects of the comparison 
level variables were mediated by ongoing 
self-reported affects. Previously in this sec- 
tion, it was hypothesized that the improved 
performance moving from the no, to the 
high, to the low comparison levels was re- 
lated to corresponding differences in atten- 
tion for these three groups. The ongoing 
measure of self-reported attention supports 
this hypothesis, as the subjects in the low 
comparison level condition self-reported 
themselves higher on attention than the 
high comparison level subjects, who in turn 
reported themselves higher on attention
than the no comparison level subjects.
These results highlight the importance of
including self-report measures of affect dur- 
ing the experiment in order to ascertain the 
mediational role that affect may have re- 
garding the effects of independent variables 
upon classroom-related performance. 

Overall, the present results provide a useful
extension of the comparison level concept
as it has been previously studied.
While previous theorists have examined the 
affective repercussions of the comparison 
level process, the influence of comparison 
level feedback upon the learning process 
has not been explored. As measured by the 
present and previous studies in our labora- 
tory, comparison level feedback may have 
important motivational effects for the class- 
room-type learning process. Perhaps most 
important, the effects of comparison level 
upon learning in the present study are antithetical
to the high achievement orientation
typically found in American education.


Spontaneous and model-induced production of a valuational style of
inquiry was studied in 128 third-grade children. Provision of a favor- 
able versus a neutral outcome expectation, and sex of child, failed to 
influence the results. All modeling groups displayed strong value-ques- 
tion increases over base line which, without further training, they gen- 
eralized to a new set of stimulus pictures. Four instructional var- 
iations, implicit, explicit, pattern (calling notice to an underlying 
similarity among the model’s questions), and mapping (exemplifying 
essential features of the model’s paradigm) proved to differ signifi- 
cantly in the postmodeling imitation phase but not in generalization. 


In a previous experiment, Rosenthal, 
Zimmerman, and Durning (1970) showed 
that observing a model’s styles of inquiry 
regarding a set of stimulus pictures pro- 
duced marked changes in the question for- 
mation of sixth-grade, primarily Mexi- 
can-American children from economically 
disadvantaged homes. Separate groups of 
youngsters observed the model create ques- 
tions based (a) on nominal or physical 
properties of stimulus objects, (b) on func- 
tional uses to which stimuli might be put, 
(c) on abstract relations concerning the
stimuli, or (d) on judgments of value or
preference regarding the stimuli. All groups 
both adopted the model's interrogative par- 
adigms and, without further training, gener- 
alized them to new stimulus pictures. The 
children did not mimic the model’s words, 
but followed her stylistic criteria in making 
questions. When implicit instructions to 
emulate were compared with explicit in- 
structions to try to learn and follow the 
model’s questions, only the nominal-physi- 
cal groups revealed a significant benefit 
from the more explicit directions. It thus 
seemed of interest to investigate a broader
range of instructional variations from the
same minimal, implicit directions through a
condition calling specific attention to the
abstract properties governing the model's
examples. For this purpose, it seemed germane
to study the value and preference
question category, which was governed by
criteria identical with Piaget's (1959, p.
217) definition of valuation and in which,
formerly, the fewest base-line responses
(virtually zero) were found. The attempt
was made in the present experiment to extend
the prior findings to a considerably
younger group of third-grade middle-class
children.

Situational “demand characteristics”
(Orne, 1962 and 1969), and “experimenter
effects” (Rosenthal, 1966; Rosenthal &
Jacobson, 1968) have been given prominent 
roles in the current social psychological lit- 
erature. In essence, researchers in this area 
have repeatedly found that situational cues 
conveyed by experimental procedures, and 
a research subject’s inferences about what 
the situation “requires,” or what the experi- 
menter anticipates or wishes to demon- 
strate, can strongly influence the results ob- 
tained. If the provision of an expectancy 
that one will do well, or that the experimen- 
ter wants and expects one to do well, were 
itself an operation adequate to facilitate the 
acquisition or transfer of abstract behavior, 
such a device would be valuable in fostering 
learning. Accordingly, half the children in 
each instructional variation were given ei- 
ther a favorable or a neutral expectation of 
their outcome performance by the experi- 
menter. 


METHOD 
Subjects 


From third-grade classrooms at two schools 
serving predominantly middle-class, Anglo-Ameri- 
can regions of Tucson, 64 boys and 64 girls were 
drawn randomly. To each of the eight (four instructional
× two expectation) experimental conditions,
eight boys and eight girls were randomly
assigned with the constraint that the proportions 
from either school be comparable; all data were 
collected in the midpart of spring semester when 
the children were typically about 9 years old. 
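
The assignment scheme described above can be illustrated with a minimal sketch. It assumes only what the text states (four instructional variations crossed with two expectation levels, with eight boys and eight girls per cell); the function name, the child labels, and the seed are hypothetical, and the additional constraint of comparable proportions from the two schools is not modeled here.

```python
import random


def assign_conditions(boys, girls, seed=None):
    """Randomly assign children to the 8 cells (4 instructions x 2 expectations),
    placing equal numbers of boys and of girls in every cell."""
    instructions = ["implicit", "explicit", "pattern", "mapping"]
    expectations = ["favorable", "neutral"]
    cells = [(i, e) for i in instructions for e in expectations]
    rng = random.Random(seed)  # seed is only for reproducibility of the sketch
    assignment = {cell: [] for cell in cells}
    for group in (boys, girls):
        if len(group) % len(cells) != 0:
            raise ValueError("group size must be a multiple of the number of cells")
        shuffled = list(group)
        rng.shuffle(shuffled)  # random order within each sex
        per_cell = len(shuffled) // len(cells)
        for k, cell in enumerate(cells):
            assignment[cell].extend(shuffled[k * per_cell:(k + 1) * per_cell])
    return assignment


# Hypothetical labels standing in for the 64 boys and 64 girls.
boys = [f"boy_{n:02d}" for n in range(64)]
girls = [f"girl_{n:02d}" for n in range(64)]
assignment = assign_conditions(boys, girls, seed=0)
```

Shuffling within each sex before dealing children into cells is one simple way to obtain the balanced 8 + 8 composition per condition; the authors' actual randomization procedure is not described in more detail than this.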


Materials and Model's Questions 


The stimuli were identical with those previously
described (Rosenthal, Zimmerman, & Durning,
1970). Two parallel but different sets of 12 pictures
were used; in each set, items showing one achromatic
common object (e.g., a typewriter) were successively
alternated with items showing three
variously colored common objects (e.g., a yellow
balloon, a yellow banana, and a red apple per
card). Thus, to prevent response stereotypy, within
each set of stimuli, consecutive items varied in
number, color, and pictorial content. The first set
of pictures was displayed to all children during
base line, was the vehicle for the model’s questions,
and then was readministered to all subjects to
assess imitative changes. The second set of pictures
was subsequently presented without further intervention
to all children, to assess generalization of
question formulation.

For all subjects, the model's questions, in the
same order, were as follows:

1. Which of these do you like best? 2. Do you
like this kind of typewriter? 3. Which do you
think is the prettiest? 4. Would you rather sit
on a park bench or on the ground? 5. Do you like
the brush or the comb better? 6. What do you
like about this? 7. Would you rather eat with a
fork or a spoon? 8. Do you like to cut things
with scissors? 9. Which would you rather hear,
the drum or the bugle? 10. Do you like the shape
of this cup? 11. Which of these would you rather
have for your own? 12. Do you like screws or
nails better?

At no time was praise or KR administered to the
model's or the subjects’ questions, and no extrinsic
incentives were offered or applied.

Correct response was defined as any question 
relating to valuational matters about the stimuli, 
and thus was based only on the categorical proper- 
ties of the model's questions, not their specific 
content. A child's score was the sum of the trials 
on which he produced value-related questions. It 
was found in the prior study that such scoring was 
highly reliable and easily accomplished; in the 
present data, fewer than 2% of response instances 
required discussion by the authors to make scoring 
decisions. All data were collected by the same adult
male experimenter and adult male model.


Procedure and Design 


The child was taken individually to a testing 
room by the experimenter who there introduced 
him to the model. In base line, the experimenter 
instructed the child as follows: 


I'm going to show you a set of cards. Ask some- 
thing about each card. Here is the next card, ask 
something about it, etc. 

Expectation variations. After base line, the ex- 
perimenter instructed the favorable expectation 
subjects as follows: 

I have been working with this game for a long 

time and with a lot of students. From the way 

you did, I can tell that you really have talent for 
this game. Once you get the hang of it, I’m sure 
that you are really going to do a great job! With 
your talent, all you need are a few hints. 
The foregoing comments were omitted in the 
neutral expectation condition, and the experimen- 
ter then presented the instructional variations as 
follows: 

Implicit instructions. For this, the experimenter 
said: 

Now this man is going to make up a question 

about each picture. You watch carefully, and you 

will have another turn later. (The model per- 
formed.) Now you can have another turn to 
make up questions about each picture. 

Explicit instructions. These directions were: 

Now this man is going to make up a question 

about each picture. You watch carefully and try 

to learn his questions just as well as you can, and 
you will have another turn later. (The model
performed.) Now you can have another turn,
etc., (plus) Try as hard as you can to make your 
questions like the man’s questions. 


Pattern instructions. These instructions were: 


Now this man is going to make up a question 
about each picture. You watch carefully and try 
to learn his questions just as well as you can. All 
of his questions are the same in a certain way. 
Try to learn how his questions are the same, and 
you will have another turn later. (The model 
performed and, afterward, the child was given 
the same final directions as in the explicit treat- 
ment.) 


Mapping instructions. For the mapping instructions
the directions were:


This man is going to make up a question
about each picture. You watch carefully and try
to learn his questions just as well as you can. All
of his questions are the same in a certain way.
That is, all of his questions ask about ‘Which do
you like?’, ‘What is prettiest?’, ‘Which would
you rather have?’ Try and learn his way of making
questions and you will have another turn
later. (The model performed and, afterward, the
child was directed as in the explicit treatment.)


Subsequent to readministration of the initial 
stimuli, the new set of generalization pictures was 
introduced, without modeling, and all children re- 
ceived the same instructions as follows: “Here are 
some new cards; ask a question about each one.” 
The model recorded the child’s question responses 
and hence was present for all procedures. 

To assess the child’s perceived level of success,
after completing the experiment he was asked,
“Did you do a good job on the game?”, and the
answers were scored dichotomously as yes or no
for later analysis.

Preliminary analyses were first performed on the
change scores from base line to imitation, and from
base line to generalization phases, for Sex × Instructions
× Expectation. These analyses (which
gave results similar to those reported below) revealed
no significant sex effects or interactions with
any other variate, nor suggestive trends (largest
F = 1.91, p > .13). The sexes were therefore combined
in the main analysis which factorially compared
the four instructional × two expectation
treatments across base-line, imitation, and generalization
phases as trials. Given a significant* overall
effect, Newman-Keuls tests (Kirk, 1968) were used
to evaluate differences within and between phases,
and to compare pairs of groups.


TABLE 1
MEAN VALUE-PREFERENCE QUESTIONS BY PHASE
FOR EXPERIMENTAL INSTRUCTIONS GROUPS

                            Phase
Group        Base line   Imitation   Generalization

Present study: Third graders

Implicit       0.32         5.91          5.09
Explicit       0.19         9.40          6.66
Pattern        0.00         8.16          5.69
Mapping        0.06        10.31          6.44

Prior study: Sixth graders

Implicit       0.21
Explicit       0.07


Results


The favorable versus neutral expectation 
groups utterly failed to differ in the main 
analysis of variance, nor did expectation in- 
teract with instructions or across-phases 
change (all Fs < 1.0; ns). The expectancy 
variations thus were combined for all spe- 
cific comparisons, and also in Table 1 which 
presents the instruction groups’ means by 
phase. 

Observing the model led to strong in- 
creases from base line across trials (F = 
293.89, df = 2/240, p < .0001). Although 
the groups were comparable in base line, 
the premodeling instruction variations 
created significant between-group differ- 
ences (F = 3.48, df = 3/120, p < .02) and 
instructions interacted with trials (F =
4.18, df = 6/240, p < .001).

Aggregately, the children surpassed their 
base-line scores in the imitation and in the 
generalization phases (both ps < .01).
Moreover, when separately compared across 
trials, each modeling group significantly ex- 
ceeded its base-line scores in both imitation 
and generalization phases (all ps < .01). 

Further analysis of the interaction term
disclosed that, in the imitation phase, the
minimal, implicit instructions group scored
lower than each of the other groups (all ps
< .01), and the mapping group surpassed
the pattern instructions group (p < .05).
No other selected comparison attained sig- 
nificance, nor did instruction treatments 
differ in their generalization phase scores. 

It is of interest to compare the present
third graders, under implicit and explicit
instructions, with the corresponding treatments
for the prior sixth-grade sample.
Although not significantly different by unweighted-means
analysis of variance procedures
(Kirk, 1968), the younger group numerically
surpassed both imitation and generalization
phase means of their older counterparts
(see Table 1). Obviously, one cannot
isolate the separate effects of age, ethnicity,
and socioeconomic status, but the
older group was hampered by linguistic and
economic marginality to a degree not fully
equalized by their 3-year age advantage
over the middle-class youngsters.


* All tests of significance reported in this paper
were based on two-tailed probability estimates.

After completing all other procedures, 
each child's answer to the question, “Did 
you do a good job?”, was recorded. The 
proportions of yes responses for the four 
instructional variations, and the point-biserial
correlations between self-reported success
and each group’s imitation and generalization
scores are presented in Table 2.

By a chi-square analysis, the groups differed
significantly in judging their success
(χ² = 8.32, df = 3, p < .05). Yet, there was
little relationship between children’s verbal 
reports and actual group attainment: Al- 
though the best-performing group (map- 
ping) also gave the most favorable judg- 
ments, the second-best group (explicit) 
judged themselves as poorest, and the two 
lowest-scoring groups gave intermediate 
self-reports. Similarly, when each child's 
scores were related to his self-report, no evidence
of systematic covariation was found;
only one of eight coefficients was significant,
and, in two cases, the correlations between
actual and self-judged performance were
negative. Thus, despite clear evidence of
conceptual learning, children’s verbalizations
did not reflect abstract performance, a
result reminiscent of other research (vide
Kendler & Kendler, 1967; Morris, 1970)
showing that, with young children, no simple
relationships may exist between verbal
labeling and actual inferential behavior.
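
As a rough check on the chi-square result for the self-report proportions, the upper-tail probability for 3 degrees of freedom can be computed in closed form. The sketch below assumes the reported statistic reads 8.32 with df = 3; the function name is hypothetical, and the formula is the standard survival function of the chi-square distribution with 3 df, not anything specific to the authors' analysis.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    For 3 df the CDF is erf(sqrt(x/2)) - sqrt(2x/pi) * exp(-x/2),
    so the survival function is its complement.
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Assumed reading of the reported statistic: chi-square = 8.32, df = 3.
p_value = chi2_sf_df3(8.32)
print(f"p = {p_value:.3f}")
```

Under that reading the probability falls below .05, which agrees with the significance level stated in the text.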


Discussion 


From vicarious training procedures, all
groups of children were able to adopt a
novel question paradigm, and to generalize
its rule-governed features to new stimuli
without further guidance. Thus, the previous
results were extended downward to
third graders, across sex of child and of
model (now male, formerly female). In the
imitation phase, informative directions enhanced
response over implicit procedures,
especially for the mapping group whose attention
was directed to the categorical
properties of the questions.


TABLE 2
PROPORTION OF CHILDREN REPORTING SUCCESS BY
GROUP, AND POINT-BISERIAL COEFFICIENTS
BETWEEN SELF-REPORTED SUCCESS AND
IMITATION AND GENERALIZATION
PHASE SCORES

                                  Correlations
Group        Proportions   Imitation   Generalization

Implicit        .636          .22           .15
Explicit        .455          .51*          .22
Pattern         .636          .03          -.21
Mapping         .788          .08          -.17

*p < .01.

Bandura (e.g., 1969 and 1971) discussed 
the factors making observational methods 
powerful means for transmitting complexly 
organized behavior. He proposes that, by 
contiguity, modeling stimuli are percep- 
tually conditioned to covert mediational 
responses in the observer who, after sym- 
bolic coding, is then able to reproduce sub- 
stantial portions of the modeled display. In 
the symbolic coding process, both imaginal 
and verbal representations are assumed to 
be established. These mediating codes 
maintain behavior in time, and subse- 
quently guide it, thus permitting the trans- 
fer of observationally acquired responses to 
new stimulus configurations consistent with 
the generalization effects presently found. 
Instructions to the observer help direct the 
attentional responses necessary for contig- 
uous sensory conditioning, and may facili- 
tate the covert formation of verbal repre- 
sentations. 

In light of this formulation, it is interesting
to compare the relative contributions of
demonstration through modeling, versus instructions,
in producing the observed outcomes.
Within the scope of the present operations,
two obvious comparisons are possible
and both assign the larger information-conveying
role to vicarious training: If
one treats the highest group mean (mapping,
in the imitation phase) as unity, then
the pure-modeling implicit-instructions
group attained over 57% of maximum performance.
If instead one examines the generalization
phase data, the pure-modeling
group performed almost as well as did any
combination of modeling plus instructions.
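
The two comparisons can be reproduced directly from the third-grade means in Table 1. The following is only an illustrative sketch: the dictionary and function names are hypothetical, and the values are the tabled group means as printed.

```python
# Third-grade group means as printed in Table 1.
imitation = {"implicit": 5.91, "explicit": 9.40, "pattern": 8.16, "mapping": 10.31}
generalization = {"implicit": 5.09, "explicit": 6.66, "pattern": 5.69, "mapping": 6.44}


def relative_to_best(means, group):
    """Express one group's mean as a fraction of the highest group mean."""
    return means[group] / max(means.values())


# Pure-modeling (implicit) group relative to the best group in each phase.
imitation_share = relative_to_best(imitation, "implicit")
generalization_share = relative_to_best(generalization, "implicit")
print(f"imitation: {imitation_share:.3f}, generalization: {generalization_share:.3f}")
```

In the imitation phase the implicit group reaches about 0.573 of the mapping mean, which is the "over 57%" figure in the text; in the generalization phase its share of the highest mean (explicit, 6.66) is about 0.764.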

Giving children a favorable expectation 
did not improve their task performance, but 
this should not lead to a premature conclusion
that a teacher's attitude does not affect
scholastic progress. One's attitude may influence
other actions which, more directly,
contribute to learning. Also, vicarious train- 
ing can modify complex behavior very rap- 
idly. The cumulative effects of motivational 
or expectancy variables may require a 
longer time span. Thus, Rosenthal, Coxon, 
Hurt, Zimmerman, and Grubbs (1970) 
found that teacher attitudes were greatly 
changed by an experimental training pro- 
gram. These attitude effects were elsewhere 
(Rosenthal, Underwood, & Martin, 1969) 
shown to be associated with important aspects
of classroom behavior, including children’s
willingness to solicit teacher attention.
Similarly, Rosenthal and Jacobson
(1968) found that teachers’ expectations
about students’ late-blooming capacity 
strongly influenced intellectual accomplish- 
ment. In all the foregoing motivational re- 
search, longitudinal procedures were used 
such that expectancy factors were sustained 
for from 1 to 3 years, quite unlike the brief 
single-session present design. 


TED L. ROSENTHAL AND BARRY J. ZIMMERMAN 


REFERENCES 


Bandura, A. Principles of behavior modification.
New York: Holt, Rinehart, & Winston, 1969.

Bandura, A. Theories of modeling. New York:
Atherton Press, 1971.

Kendler, T. S., & Kendler, H. H. Experimental
analysis of inferential behavior in children. In
L. P. Lipsitt & C. C. Spiker (Eds.), Advances in
child development and behavior. Vol. 3. New
York: Academic Press, 1967.

Kirk, R. E. Experimental design: Procedures for
the behavioral sciences. Belmont, Calif.: Brooks/
Cole, 1968.

Morris, L. A. Relational responding to a transposition
task as a function of relevant verbalizations
and feedback. Unpublished doctoral dissertation,
University of Arizona, 1970.

Orne, M. T. On the social psychology of the psychological
experiment: With particular reference
to demand characteristics and their implications.
American Psychologist, 1962, 17, 776-783.

Orne, M. T. Demand characteristics and the concept
of quasi-controls. In R. Rosenthal & R. L.
Rosnow (Eds.), Artifact in behavioral research.
New York: Academic Press, 1969.

Piaget, J. The language and thought of the child.
(3rd ed.) London: Routledge & Kegan-Paul,
1959.

Rosenthal, R. Experimenter effects in behavioral
research. New York: Appleton-Century-Crofts,
1966.

Rosenthal, R., & Jacobson, L. Pygmalion in the
classroom: Teacher expectation and pupils’ intellectual
development. New York: Holt, Rinehart,
& Winston, 1968.

Rosenthal, T. L., Coxon, M., Hurt, M., Zimmerman,
B. J., & Grubbs, C. F. Pedagogical attitudes
of conventional and specially-trained teachers.
Psychology in the Schools, 1970, 7, 61-66.

Rosenthal, T. L., Underwood, B., & Martin, M.
Assessing classroom incentive practices. Journal
of Educational Psychology, 1969, 60, 370-376.

Rosenthal, T. L., Zimmerman, B. J., & Durning, K.
Observationally-induced changes in children’s interrogative
classes. Journal of Personality and
Social Psychology, 1970, 16, 681-688.


(Received June 9, 1971) 


Journal of Educational Psychology
Journal d. No. 6, 605-812 


DELAY-RETENTION EFFECT WITH MULTIPLE-CHOICE TESTS¹


RAYMOND W. KULHAVY² AND RICHARD C. ANDERSON


University of Illinois 


High school juniors and seniors completed a multiple-choice test
on topics in introductory psychology under various conditions of
immediate and delayed feedback. On the same test, a week later,
delayed-feedback groups performed significantly better than immediate-feedback
groups. Groups that studied the feedback booklet prior
to the initial test performed best of all. Analyses of the likelihood of
forgetting responses to the first test over the delay interval, the
probability of repeating initial errors on the final test, and feedback
study time supported the conclusion that the delay-retention effect
primarily is due to the forgetting of interference-producing errors
during the delay interval and, secondarily, to increased attention to
the feedback after a delay.


Research with meaningful multiple- 
choice tests has shown that learners who 
receive immediate knowledge of the correct 
responses, or feedback, retain less than 
learners for whom feedback is presented 
after a period of delay. This phenomenon 
has been labeled the Delay-Retention Effect
(DRE) by Brackbill and her associates
who were among the first to define its properties
(e.g., Brackbill, Bravos, & Starr,
1962; Brackbill & Kappy, 1962).

Delay of feedback consistently facilitates
retention only in the case where the task involves
meaningful verbal matter similar to that
encountered in instruction. In the more
basic human learning paradigms, for example,
discrimination learning, concept formation,
and so on, facilitation due to delay is
much less consistent, and often the results
point in the opposite direction. However,
the aim of this research was not to question
the reliability of the effect (it undoubtedly
exists), but rather to provide an adequate
explanation for its occurrence.

We were able to find 11 experiments involving
a feedback delay with meaningful
materials (English & Kinzer, 1966; Moore, 
1969; Phye & Baller, 1970; Sassenrath & 
Yonge, 1968, 1969; Sturges, 1969; Sturges & 
Crawford, 1963, 1964). Of the 11, only 3 
failed to find superior retention for the 
delay compared to the immediate groups 
(Sturges & Crawford, 1964, Experiments II 
and IV; Phye & Baller, 1970). 

The paradigm used in DRE studies contains
five components. Schematically, these
components are:

$$T_1 \xrightarrow{\;d_i\;} F \xrightarrow{\;r_i\;} T_2$$

where: $T_1$ signifies initial exposure to the
multiple-choice items;

$F$ is the $T_1$ items with the correct
responses identified (feedback);

$T_2$ the second presentation of the
test (the dependent measure);

$d_i$ a delay interval of some time $t_d$;
and

$r_i$ a retention interval of some time
$t_r$.


The standard DRE experiment includes
two groups, each receiving $T_1$, $F$, and $T_2$.
The treatments differ in that the delay
group receives $F$ following some specified
interval ($t_d = n$), and the no delay group is
given the $F$ immediately following $T_1$
($t_d = 0$). The treatments can be represented as
follows:


NO DELAY GROUP

$$T_1 \xrightarrow{\;(t_d = 0)\;} F \xrightarrow{\;(t_r)\;} T_2$$

DELAY GROUP

$$T_1 \xrightarrow{\;(t_d = n)\;} F \xrightarrow{\;(t_r)\;} T_2$$

A variety of interpretations have been 
proposed to account for the DRE. On one 
point, all of the investigators agree. The re- 
search on the DRE with meaningful multi- 
ple-choice tests is inconsistent with the ani- 
mal literature on delay of reinforcement. 
This is shocking if you believe feedback 
functions as reinforcement. However, the 
best evidence is that feedback is largely important
as correction for errors (Guthrie,
1971).

Sturges and Crawford (1964) have main- 
tained that delay of feedback improves 
later performance on a multiple-choice test 
because subjects “engage in some relevant 
covert symbolic activity in the interval be- 
tween initial presentation and presentation 
of (feedback) [p. 19].” In other words, they 
argue that subjects who do not receive immediate
feedback spend more extraexperimental
time rehearsing the questions than
subjects who receive immediate feedback. 
Why this should be the case is difficult to 
understand. Sturges and Crawford (1964) 
also fail to make clear what the processes 
involved in the rehearsal activity might be,
or in what manner they function to promote
retention. 

Sassenrath and Yonge (1968, 1969) have 
postulated that because human subjects use 
language they are able to relate delayed 
feedback to earlier learning. This is perhaps 
a description of, but not an explanation for, 
the DRE. These same authors suggest the 
existence of implicit response-produced cues
that give birth to further cues at the delay 
point, the retention test, or both. As was the 
case with Sturges and Crawford (1964), 
scant attention was paid to how these 
mechanisms might operate, or in what spe- 
cific way they bring about improved reten- 
tion. 

In our judgement, the explanations ad- 
vanced to this point fail to adequately ac- 
count for why the DRE occurs with mean- 
ingful material. Our explanation is very 
simple: learners forget their incorrect re- 
sponses over the delay interval, and thus 
there is less interference with learning the 
correct answers from the feedback. The 
subjects who receive immediate feedback, 
on the other hand, suffer from proactive in- 
terference because of the incorrect responses 
to which they have committed themselves.
This explanation will be called the interfer- 
ence-perseveration hypothesis. 

Table 1 shows the analogous components
in the proactive interference and DRE paradigms,
given a correct or incorrect response
on the first administration of the
test. The assumption is that taking a test
strengthens response tendencies. According
to the interference-perseveration hypothesis,
when a person makes an error on the
first test, he strengthens an A-B connection
which then interferes with acquiring an
A-C connection from the feedback. Proactive
interference is greatest when stimuli in
successive tasks are identical and the responses
are dissimilar. This, it is argued, is
the condition that prevails when an incorrect
response is made on the first test. According
to this analysis, a person who
makes a correct response choice on the first
test places himself in the A-C A-C paradigm,
a condition known to facilitate retention.


TABLE 1
ANALOGOUS DRE AND INTERFERENCE COMPONENTS FOR THE CASE OF EITHER A CORRECT OR INCORRECT
ANSWER ON THE FIRST ADMINISTRATION OF THE TEST

Paradigm             Error on first test            Correct response on first test

Interference         A-B, then A-C                  A-C, then A-C
DRE (first test)     Item stem-incorrect choice     Item stem-correct choice
DRE (feedback)       Item stem-correct choice       Item stem-correct choice

There is evidence that taking a test 
strengthens response tendencies (Anderson 
& Myrow, 1971; Roderick & Anderson, 
1968; Rothkopf, 1966; Spitzer, 1939). The 
results of these studies have consistently 
demonstrated that receiving a test immedi- 
ately or shortly after instruction improves 
performance when the test is given again, 
even when feedback is not provided. 

There is also research that shows that errors
perseverate (e.g., Cunningham & Anderson,
1968; Elley, 1966). These studies
indicate that errors committed early in
learning tend to reappear later, in spite of
the fact that a subject receives feedback
between presentations. Of special interest is
the study of Kaess and Zeaman (1960) because
it employed a multiple-choice test
covering meaningful material. Kaess and
Zeaman manipulated initial error probability
by varying the number of incorrect alternatives
presented to subjects answering 30
multiple-choice test items dealing with introductory
psychology. As predicted, errors
made on the initial test tended to perseverate
to later trials on the same items. Only
after several trials with feedback were the
learners able to overcome the tendency to
repeat the errors. Because of the materials
and procedures used, these results provide
strong support for the perseveration notion.

According to our analysis, the first administration
of the test and the presentation of
feedback in a DRE study are comparable,
respectively, to the learning of the first and
the second list in a proactive inhibition experiment.
If the analysis is sound, a delay
between the two lists in the latter type of
study should reduce proactive inhibition.
This is precisely what happens (Abra, 1969;
Underwood & Ekstrand, 1967; Underwood
& Freund, 1968).

A second possible factor contributing to 
the DRE is a decrease in attention paid to 
feedback when it is presented immediately 
after a difficult test. Under these conditions, 
subjects may be tired and frustrated, and 
thus unlikely to study feedback as carefully 
as they might otherwise. The hypothesis of 
Sturges and Crawford (1964) involves in- 
creased processing of the material when 
feedback is delayed; however, their expla- 
nation focuses on the delay interval rather 
than the point at which feedback is given. 
An analysis of feedback inspection times 
should determine the degree to which the 
DRE is influenced by attention. If delay of 
feedback affects attention, the delay-of- 
feedback groups would be expected to take 
longer to complete study of feedback than 
immediate-feedback groups. 


METHOD 


Subjects 


The subjects were 194 high school students 
representing the entire junior and senior classes at 
the high school in a small midwestern town. None 
of the subjects had taken a course in psychology.
Subjects participated in their regular class groups.
Thirteen additional subjects were dropped from 
the experiment for failing to attend one or more 
of the experimental sessions. 


Control Conditions 

In order to assess the effects of repeated testing 
or changes in the subject population over the 
course of the study, three control groups were in- 
cluded. One repeated the test on each day of the 
experiment. The second took the test on Day 2 
and again on Day 7. The third received the test 
only on Day 7. When not completing the test, 
control subjects read passages on an irrelevant 
topic. 

Treatment Conditions

The eight experimental treatment conditions
are diagrammed below. In all cases $T_1$ represents
the first exposure to the test items. Feedback ($F$)
refers to the same test items with the correct
answers underlined. In this case, the subject was
required only to study the items and to learn the
correct responses. Where $F$ is followed by the letter
J (judge) the subject was instructed to identify
his previous $T_1$ response at the time of feedback
by circling the $T_1$ alternative previously chosen.
$T_2$ signifies the posttest measure in which the test
items were administered in the same manner as in
$T_1$. The notation P (placebo) represents a prose
passage given to subjects not receiving experimental
materials during a given session.


Condition                          Sequence
abbreviation     Day 1             Day 2       Day 7

TF               $T_1$ + $F$       P           $T_2$
TF-J             $T_1$ + $F$-J     P           $T_2$
TDF              $T_1$             $F$         $T_2$
TDF-J            $T_1$             $F$-J       $T_2$
FT               $F$ + $T_1$       P           $T_2$
FDT              $F$               $T_1$       $T_2$
F                $F$               P           $T_2$
DF               P                 $F$         $T_2$

Materials 


The experimental test consisted of 35 four-
alternative, multiple-choice items dealing with
topics in introductory psychology. Initially, a pool
of 50 such items was gathered: 30 from previous
DRE research (e.g., Sassenrath & Yonge, 1968,
1969; Sturges, 1969; Sturges & Crawford, 1963 and
1964, Experiment I), 10 from a programmed text
on vision (O’Day, Kulhavy, Anderson, & Malcynz-
ski, 1971), and 10 constructed by the present
authors. Prior to the experiment the 50 items were
given to 46 students from a suburban high school,
who were asked to respond to them without prior
subject-matter instruction. The 15 items answered
correctly most often were deleted from the test.
On none of the items included in the study did
performance vary greatly from chance.
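
As a rough check on what "chance" means for this test, the sketch below computes the expected score, and its binomial standard deviation, for a subject who guesses blindly on all 35 four-alternative items. The function names are illustrative and not part of the original study; the calculation assumes independent, uniformly random guesses.

```python
# Expected score from blind guessing on the 35-item, four-alternative test.
# Assumption: every answer is an independent, uniformly random guess.
N_ITEMS = 35          # items retained after pretesting (from the text)
N_ALTERNATIVES = 4    # alternatives per item (from the text)


def chance_score(n_items: int, n_alternatives: int) -> float:
    """Expected number correct when every answer is a random guess."""
    return n_items / n_alternatives


def chance_sd(n_items: int, n_alternatives: int) -> float:
    """Binomial standard deviation of the number correct under guessing."""
    p = 1 / n_alternatives
    return (n_items * p * (1 - p)) ** 0.5


expected = chance_score(N_ITEMS, N_ALTERNATIVES)
spread = chance_sd(N_ITEMS, N_ALTERNATIVES)
print(f"chance score = {expected:.2f} items, SD = {spread:.2f}")
```

Under these assumptions the guessing baseline is 8.75 items, which is the natural yardstick for the first-test means of the groups that had not yet seen feedback.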

The items were mimeographed on 5½ × 8 inch
sheets of paper, and stapled into booklets that
formed the tests. The feedback booklets were
identical with the exception that the correct re-
sponse was underlined for each item. The order
of questions in each booklet containing either test
items or feedback was separately randomized for
each subject on each occasion.

The placebo material consisted of a 2,240-word 
prose passage describing a fictitious African tribe, 
the “Himoots.” When another placebo was re-
quired, a second, 2,190-word passage was used that
detailed a different tribe, the “Gruanda.” Neither 
placebo passage contained material related to the 
test items. 


Procedure 


The experiment was conducted in three sessions,
each 1 hour long. The first two sessions were held
on successive days and the final session 1 week
following the first day. Envelopes containing
materials for each of the 11 conditions were dis-
tributed randomly in a given classroom, with the
exception that subjects were assigned to the con-
trol groups at a lower per-classroom ratio. Sub-
jects in each of the 11 conditions participated
simultaneously in all of the nine classrooms in-
volved.

To attenuate the effects of group time pres- 
sure, subjects were repeatedly told that they were 
working on different tasks which would probably 
require various lengths of time to complete. De- 
tailed instructions concerning the procedures to be 
followed in each condition were contained in the 
envelopes with the experimental materials. When 
a subject completed the initial task he signaled the 
experimenter, who removed the completed mate- 
rials. At this point, in Conditions TF, TF-J, and 
FT, the subject was given a second envelope con- 
taining the remaining material to be completed.
There was no time limit imposed on completion
of any of the materials. All subjects completed
their assigned tasks within their regular class hour.
The time a subject spent in studying the feedback
booklets was recorded by the experimenter to the
nearest minute. At no time on either the first or
second day were subjects told that there would be
additional experimental sessions.

All of the subjects, regardless of condition, 
received the test, a six-item questionnaire, and a
verbal ability measure on the last day of the
study. The questionnaire assessed the subjects’
activities during the experiment and between ses- 
sions. Following the test and the questionnaire, 
subjects were given the Wide Range Vocabulary 
Test, a 48-item test of verbal ability (French, 
Ekstrom, & Price, 1963). Within each of the 11 
conditions, subjects were divided into high- and 
low-ability levels based on the overall verbal 
ability median. 


RESULTS


Control Conditions 


Table 2 contains the mean numbers of 
correct responses for all groups on both 
tests. Analysis of variance on the de-
layed test indicated no significant differ-
ences among the control groups (F = .418,
df = 2/29). These results indicate that
repeated testing and changes in the
population over the duration of the study
produced no detectable differences.

All experimental groups performed signif-
icantly better (α = .01) on the delayed test
than the pooled control groups.


Experimental Conditions 


First test. An unweighted means analysis
of first-test variance yielded significant ef-
fects for treatment (F = 193.18, df = 5/
108, p < .01). Newman-Keuls tests indi-
cated that Groups FT and FDT performed
significantly better than all the remaining
groups. No other comparisons were signifi-
cant.


TABLE 2
UNWEIGHTED MEAN NUMBER OF CORRECT RESPONSES ON THE TESTS

             | Experimental conditions                                       | Control conditions
Item         | TF    | TF-J  | TDF   | TDF-J | FT    | FDT   | F     | DF    | [column labels illegible]
N            | 2[?]  | 20    | 20    | 19    | 18    | 22    | 2[?]  | 2[?]  | 12    | 10    | [?]
First test   | 9.70  | 9.25  | 10.55 | 9.82  | 29.89 | 29.18 | -     | [?]   | [?]   | 9.20  | [?]
Delayed test | 17.57 | 15.35 | 23.05 | 20.97 | 26.22 | 27.36 | 19.80 | 21.68 | 10.50 | 9.70  | 10.00

a The mean for this group on the second test was 11.50.


Delayed test. Completed first was an 
analysis involving only Groups TF and 
TDF to determine whether the usual DRE 
occurred under the conditions that pre- 
vailed in this study. Delay of feedback did 
produce a significant increment in delayed 
test scores (F = 9.52, df = 1/37, p < .01).

Done next was an unweighted means 
analysis of delayed test variance involving 
all eight experimental groups. Both treat- 
ment (F = 8.94, df = 7/146, p < .01) and 
verbal ability (F = 15.91, df = 1/146, p <
.01) were significant sources of variance.
Table 3 summarizes Newman-Keuls com- 
parisons among treatments. 

Conditional probability analysis. Accord-
ing to the interference-perseveration hy-
pothesis, subjects who receive immediate
feedback are less able to recover from er-
rors on the first test than subjects who re-
ceive delayed feedback. In other words, the
conditional probability of a wrong response
on the delayed test (W2) given an incorrect
response to the same item on the first test
(W1) should be higher in groups that re-
ceive immediate rather than delayed feed-
back. The mean values of P(W2|W1), and
also P(R2|R1), for the groups that re-
ceived the first test appear in Table 4.

Considering just the four groups that re-
ceived the test as the first experimental
event, an unweighted means analysis of the
arc sin transforms of P(W2|W1) indicated
an effect for delay of feedback (F = 15.00,
df = 1/71, p < .01) but none for whether or
not the responses to the first test were
judged (F = 3.10, df = 1/71) or for verbal
ability (F = 3.30, df = 1/71). There was
substantially greater error perseveration
when feedback was presented immediately
than when it was delayed.
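
The conditional probabilities analyzed above can be computed per subject from item-by-item scores on the two tests. The sketch below is illustrative only: the function names and the five-item example are hypothetical, and the transform shown, 2·arcsin(√p), is one common form of the arc sine transformation; the text does not say which form was used.

```python
import math


def conditional_probabilities(first, second):
    """Return (P(R2|R1), P(W2|W1)) for one subject.

    `first` and `second` are equal-length lists of booleans, True meaning
    the item was answered correctly on that test. A probability is None
    when its conditioning event never occurred.
    """
    if len(first) != len(second):
        raise ValueError("both tests must cover the same items")
    # Delayed-test outcomes, split by the outcome on the first test.
    after_right = [s for f, s in zip(first, second) if f]
    after_wrong = [s for f, s in zip(first, second) if not f]
    p_r2_r1 = sum(after_right) / len(after_right) if after_right else None
    p_w2_w1 = (sum(not s for s in after_wrong) / len(after_wrong)
               if after_wrong else None)
    return p_r2_r1, p_w2_w1


def arcsine_transform(p):
    """Variance-stabilizing transform 2 * asin(sqrt(p)) for a proportion."""
    return 2 * math.asin(math.sqrt(p))


# Hypothetical subject: five items, the first two correct on the first test.
first_test = [True, True, False, False, False]
delayed_test = [True, False, True, False, False]
p_rr, p_ww = conditional_probabilities(first_test, delayed_test)
```

In this hypothetical case the subject repeats one of two correct responses and two of three errors, so a high P(W2|W1) corresponds to the error perseveration the hypothesis predicts under immediate feedback.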

If feedback functions as rein-
forcement, P(R2|R1) would be expected to
be higher when feedback is given immedi- 


TABLE 3
VALUES OF q FOR NEWMAN-KEULS COMPARISONS OF DELAYED TEST SCORES

Treatment conditions

Group | TF  | F   | TDF-J  | DF    | TDF      | FT    | FDT
TF-J  | [?] | [?] | 4.19** | 4.60* | 5.67*    | 7.99* | 8.84*
TF    |     | [?] | 2.50   | 3.02  | 4.0[?]** | 6.37* | 7.20*
F     |     |     | .86    | 1.38  | 2.39     | 4.[?] | 5.56*
TDF-J |     |     |        | .52   | 1.53     | 3.86  | 4.7[?]**
DF    |     |     |        |       | 1.01     | 3.34  | 4.18
TDF   |     |     |        |       |          | 2.33  | 3.17
FT    |     |     |        |       |          |       | [?]

TABLE 4
MEAN CONDITIONAL PROBABILITIES

Item     | TF  | TF-J | TDF | TDF-J | FT  | FDT
P(R2|R1) | .70 | .65  | .78 | .72   | .80 | .87
P(W2|W1) | .69 | .63  | .39 | .44   | .46 | .61


ately. An analysis of the arc sin transforms 
of P(R2|R1) involving the four experimen-
tal groups that began with the first test
showed no difference due to delay of feed-
back (F = 2.67, df = 1/71) or whether
first-test responses were judged (F = 1.67,
df = 1/71). Subjects with high verbal ability
repeated a higher proportion of responses
correct on the first test than subjects of low
ability (F = 8.33, df = 1/71, p < .01).

Accuracy with which responses on the 
first test were identified. If the interfer- 
ence-perseveration hypothesis is correct, 
subjects tend to forget their initial re-
sponses during the feedback delay interval. The
two groups that judged initial responses
were included to check this assumption.
Pooling the proportions of right and wrong
responses correctly identified, which were
negligibly different, the immediate-feed-
back group (TF-J) correctly identified .44
of the first-test responses whereas the pro- 
portion identified by the delayed-feedback 
group (TDF-J) was only .18. This differ- 
ence was significant (z = 10.40, p < .01). 

Feedback study time. Table 5 reports the 
mean time in minutes that the various ex- 
perimental groups spent studying the feed- 
back. There were significant differences 
among treatments (F = 3.21, df = 7/146, p 
< .01), but not ability levels (F = 1.77, df
= 1/146). Newman-Keuls comparisons indi-
cated that, with the exception of Group FT,
groups that received feedback as the first 
event in a session spent more time on the 
feedback than groups which first received 
the test. 

Apparently the DRE could be due in
whole or in part to increased attention to
the feedback following a delay rather than
interference perseveration. To clarify this
matter, analyses of covariance were per-
formed on both the P(R2|R1) and
P(W2|W1) measures, with delay of feed-
back and presence or absence of response
judging as the factors, and verbal ability
and feedback study time as covariates. The
only significant effect in either analysis was
the delay factor with the P(W2|W1) meas-
ure (F = 9.58, df = 1/74, p < .01). By
inference, there is still strong support for
the interference-perseveration hypothesis.

If it is assumed that the DRE is due to
interference-perseveration and/or increased
attention to feedback following a delay, it
is possible to compare the explanatory
power of the two hypotheses using the pres-
ent data. Estimates of the variance in
P(W2|W1) due to the delay-of-feedback
main effect were obtained in two separate
analyses. In the first analysis, with only
verbal ability as a covariate, omega
squared (ω1²) was .150. In the second anal-
ysis, with both verbal ability and feedback
study time as covariates, ω2² = .095. The
ratio (ω1² − ω2²)/ω1² estimates the de-
lay-of-feedback variance attributable to
study time. For the present data, this value
expressed as a percentage is 36.7%. The
percentage attributable to interference per-
severation, estimated from the ratio ω2²/
ω1², is 63.3%.
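
The arithmetic behind the two percentages can be reproduced in a few lines. The sketch below uses the two omega-squared estimates reported in the text; the function and variable names are illustrative.

```python
def partition_delay_variance(omega_sq_ability_only, omega_sq_with_time):
    """Split the delay-of-feedback variance into two shares.

    omega_sq_ability_only: omega squared with verbal ability as the only
        covariate (.150 in the text).
    omega_sq_with_time: omega squared with verbal ability and feedback
        study time as covariates (.095 in the text).
    Returns (share attributable to study time, share left for
    interference perseveration).
    """
    study_time_share = ((omega_sq_ability_only - omega_sq_with_time)
                        / omega_sq_ability_only)
    perseveration_share = omega_sq_with_time / omega_sq_ability_only
    return study_time_share, perseveration_share


study_time, perseveration = partition_delay_variance(0.150, 0.095)
print(f"study time: {study_time:.1%}, perseveration: {perseveration:.1%}")
```

The two shares sum to 1 by construction, so the partition assumes that everything removed by adding study time as a covariate belongs to the attention explanation and everything remaining belongs to interference perseveration.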


Postexperiment Questionnaire 


The results from the questionnaire failed 
to indicate any type of rehearsal during the 
delay or retention intervals such as Sturges 
and her associates have hypothesized. Of 
course, the subjects who received delayed
feedback reported having greater difficulty
remembering their initial responses than did
subjects who received feedback immedi- 
ately. 


TABLE 5
UNWEIGHTED MEAN TIME IN MINUTES SPENT ON FEEDBACK

Treatment conditions

TF    | TF-J  | TDF   | TDF-J | FT    | FDT   | F     | DF
11.57 | 11.35 | 14.25 | 14.72 | 12.94 | 15.27 | 15.35 | 14.77

DISCUSSION


The results of this study provide une- 
quivocal support for the interference-per- 
severation interpretation of the DRE. In 
laying a theoretical base for the experiment 
it was argued that delay of feedback on a 
difficult test will improve performance when 
the test is given again later because initial 
error responses will tend to be forgotten 
during the delay interval and, consequently, 
interfere less with learning of the correct 
responses from the feedback. There are 
three parts to the argument. The first is 
that a test strengthens response tendencies. 
The fact that the groups which received 
“feedback” and then took the test per- 
formed substantially better on the delayed 
test than groups that received feedback but 
did not take the initial test supports this 
contention. The second argument is that, 
following a delay, subjects forget the re- 
sponses to which they committed themselves 
on the initial test. In support of this propo- 
sition, a smaller proportion of initial re- 
sponses were identified at the time of feed- 
back after a delay than when feedback was 
given immediately. Finally, it is argued 
that error tendencies interfere with the 
learning of the correct answers from the 
feedback. Putting the arguments together, 
it follows that there will be greater persev- 
eration of errors when feedback is immedi- 
ate than when it is delayed. The results 
showed that the probability of repeating an 
initial error on the delayed test was 
markedly higher for immediate-feedback 
groups than for delayed-feedback groups. 

Despite the strong evidence for the interfer-
ence-perseveration interpretation, the feed-
back study-time data suggested a second-
ary explanation based on attention. All of
the delay-of-feedback groups took longer
than their immediate counterparts to study
the feedback. The covariance analyses that
did, and did not, hold time constant indi-
cated that, while nearly two-thirds of the
delay-of-feedback variance was attributa-
ble to interference-perseveration, more than
one-third was attributable to increased
study time.

There is evidence, then, that attention is 
a secondary factor in explaining the DRE.
Quite probably, the decreased time spent 
studying feedback immediately following a 
difficult test is due to fatigue and frustra- 
tion. This interpretation is supported by the 
fact that the groups receiving feedback on 
the first day took significantly more time to 
study the feedback than those groups which 
received feedback after the initial test. 
These results rule out the attention-be- 
cause-of-test and uncompleted-task hy- 
potheses, since the former groups received 
no initial test over the material, and the 
questionnaire data indicate that they were
not expecting to receive one at a later date. 

The data from this study fail to support 
the view that feedback acts as reinforce- 
ment. If feedback were reinforcing, immedi- 
ate feedback would be expected to increase 
the likelihood of repeating initially correct 
responses. In fact, the probability of repeat- 
ing correct responses on the final test was 
no higher for immediate-feedback groups 
than for the delayed-feedback groups. 

In summary, the evidence from the pres- 
ent study indicates that the delay-retention 
effect is due primarily to the forgetting of 
interference-producing errors during the 
delay interval and, secondarily, to the in- 
creased time a subject spends studying the 
feedback after a delay. The implications for 
instruction are obvious. One should take 
care that learners have thoroughly mas- 
tered materials before giving them a test. 
Feedback should be delayed for a day or
two, especially if there is an error rate of
any magnitude.
SOCIAL ECOLOGY OF UNIVERSITY STUDENT RESIDENCES 


RUDOLF H. MOOS 


Stanford University 


The development, initial standardization, and substantive data of the 
University Residence Environment Scales (URES) are presented. The 
URES, a true-false, perceived environment scale composed of 10 sub- 
scales, was found to discriminate among the 74 student residences in
the current norm group. The URES has high internal consistency, and
both high test-retest and overall-profile reliability. Comparisons be-
tween dormitories and fraternities and men’s, women’s, and co-ed dor-
mitories are presented. The uses of the URES in the areas of program
evaluation, change processes, and architectural-behavior research are
discussed.


While the environment is generally consid- 
ered to be a pervasive and extraordinarily 
powerful influence on behavior, the exact 
specification of environmental or situa- 
tional variables has been relatively ne- 
glected. Early efforts in formulating these 
issues were presented by Murray (1938), 
Lewin (1951), and Mead (1934). In more 
recent years, environmental and situational
variables have been essential to analyses of
social systems (Parsons, 1951) and behav-
ior in face-to-face interactions (Goffman,
1971). In personality psychology, tradi-
tional "trait" theories have been increas-
ingly replaced by formulations stressing sit-
uational determinants (Mischel, 1968).
Similarly, theories of therapy and clinical
behavior change have stressed the impor-
tance of environmental events in change of
deviant behavior and the maintenance of
more adaptive response systems (e.g., Ban-
dura, 1968).

However, with the exception of the work 
of Barker (1968) and his colleagues (e.g., 
Barker & Gump, 1964), empirical attempts 
to specify environmental variables have, 
until recently, been notably absent. In the 
last few years, a number of investigations 
have focused on nonsocial and architectural 
environments (Craik, 1970), the consist- 
ency of personality traits under varying 
situational conditions (Endler & Hunt, 
1968; Hunt, 1965; Moos, 1969), and the re- 
lationship of therapeutic behavior change to 
environmental stabilities (Bandura, 1970). 

Various measures of “perceived” environ- 
ment have been constructed for such diverse 
environments as psychiatric wards (Moos 
& Houts, 1968; Sommer, 1969), correctional 
facilities (Moos, 1968), high school class- 
rooms (Walberg, 1969), and university en- 
vironments (Astin, 1968; Pace, 1969; Stern, 
1970). For example, Stern’s (1970) College 
Characteristics Index (CCI) and the Col- 
lege and University Environment Scale 
(CUES), developed by Pace (1969), were 
designed to measure the environment of col- 
leges and universities by means of true- 
false questions asking students about their 
activities and impressions of the college en- 
vironment. 

Another approach exemplified by the En- 
vironmental Assessment Technique (EAT) 
of Astin and Holland (1961) characterizes 
educational institutions using student char-
acteristics as indices of environmental im-
pact. These characteristics include average
intelligence and size of the student body
and six “personal orientations” based on the
proportions of students in six broad areas of
study (e.g., scientific, artistic). More re-
cently, Astin (1968) has developed the In-
ventory of College Activities (ICA), which
covers four broad areas of environmental
“stimuli”: peer, classroom, administrative,
and physical facilities.

While these measures represent notable 
advances in the assessment of environments 
and their impact on individuals, particu- 
larly in educational institutions (for com- 
parisons of these methods see Creager & 
Astin, 1968; Feldman, 1970; Pace, 1970), it 
appears quite clear that college environ- 
ments are not monolithic and undifferen- 
tiated (e.g., Pace, 1966) but are composed 
of various subenvironments that may have 
considerable impact in themselves on stu- 
dents and also on the larger college environ- 
ment. 

One such important environment is the 
immediate, on-campus living residence 
(dormitory, fraternity, sorority, etc.) where 
students spend much of their nonclassroom 
time and in which a large proportion of in-
terpersonal learning and peer influence oc-
curs (Feldman & Newcomb, 1969; New-
comb, 1943; Wallace, 1966). For example,
the immediate living environ-
ment (as distinguished from the general
college environment) may have significant
impacts on students in areas such as satis-
faction with college life, intellectual and ac-
ademic productivity, changes in subjective
mood states, and the development of psy-
chiatric symptomatology. In order that
these and other questions about the effect of
the residential environment on students
could be approached, a scale was developed
that both measures salient features of the
residence environment and allows for
systematic comparison across a wide vari-
ety of living arrangements in different col-
lege and university settings.

Three methodological approaches can be 
utilized to measure residence environments. 
The ecological approach might include the 
measurement of residence size, sex ratio of 
residents, student-staff ratio, the number of
one-, two-, and three-person rooms, etc. A
behavioral observation method (e.g., Bar-
ker’s, 1968, behavior setting approach)
might focus on types and frequency of var- 
ious activities of residents such as amount 
of time spent together, the attendance at 
house social functions, types of behaviors at 
mealtimes and house meetings, etc. 

A third method, and the one employed in 
the present study, is logically similar to 
that used in the CCI (Pace & Stern, 1958), 
the CUES (Pace, 1969), and the Ward At-
mosphere Scale (Moos, 1971); this may be
termed the perceptual approach. Students
and staff are asked to describe the usual
patterns of behavior in their living units
and their perceptions of the house. While
each person may perceive his environment
in idiosyncratic ways, there is a point at
which each individual's private world
merges with that of others so that common 
interpretations of events tend to arise out of 
common experiences. It is this consensual 
perception of the press of the immediate 
environment (in Murray’s (1938) terminol- 
ogy the “beta” press) that the URES was 
developed to measure. 

Each of the above approaches to the 
measurement of environments undoubtedly 
would yield important information about 
the climate of university residences, and 
would be expected to be moderately corre- 
lated with data obtained using other 
methods. The usefulness of the perceptual 
approach may be seen in part by noting 
that the press of the external environment 
(including the behavior of other persons 
and ecological variables) suggests the direc-
tion a resident's behavior must take if he is
to function with a minimum of stress and a
maximum of satisfaction within his particu-
lar living group. For example, a student's
perception of the friendliness or hostility of
the environment regarding certain of his be-
haviors will channel his actions as a func-
tion of these anticipated rewards and pun-
ishments possible in his living unit. These
perceptions will, in turn, direct him to var-
ious aspects of the environment, such as
particular groups or individuals in his dor-
mitory, who may, through modeling and re-
inforcement processes, have an important
impact on his subsequent attitudes, value
orientations, intellectual curiosity, and
self-evaluations.

Two major questions were asked in the
present study: (a) Does the psychological 
environment vary from one living environ- 
ment to another, and can these differences 
be measured by the URES? (b) Can the 
psychological environment of a residence be 
described in relatively homogeneous ways 
by persons in that milieu? 


METHOD


Several methods were employed in obtaining the 
initial pool of questionnaire items and in gaining a
naturalistic understanding of dormitory climates. 
First, meetings with groups of dormitory residents 
were arranged to talk about perceptions of their in- 
dividual houses and to discuss with them their 
likes, dislikes, and general observations on dormi- 
tory living. Second, various environmental scales 
were studied to generate additional ideas about 
items that might discriminate between university 
residences. Third, various written accounts were 
searched (e.g., Katz, et al., 1968; Sanford, 1962) in 
an effort to identify differing dormitory atmos- 
pheres and to understand dimensions along which 
university residences would vary. Last, observa- 
tions by university housing personnel were solic- 
ited, and the authors’ own reminiscences of their
college experiences were scrutinized and formalized,
wherever possible, into items. These sources gen-
erated an item pool of more than 500 initial ques-
tions.

The items were then sorted into categories by 
agreement between the two authors. An initial set 
of categories was selected on the basis of the above 
considerations and rough groupings of the items 
themselves, and also from lists of environmental 
press from Murray (1938) and Stern (1970), and
from the previous work of Moos (e.g., 1968, 1972).
Sixteen additional items were formulated in order
to identify individuals who showed a strong posi-
tive or negative bias in their perceptions of their
living residences.

The resulting 274-item questionnaire was given
to both student and staff residents in 13 dormitories
at a private university. These dormitories included
male, female, and coeducational houses, both large
and small units, and houses composed of only
freshmen or only upperclassmen or all four under-
graduate classes combined.


Revision of Preliminary URES 


The total numbers of students and staff tested in
the 13 dormitories were 455 and 11, respectively.
Approximately 55% of the students approached
returned usable questionnaires. This percentage
varied from 42% to 92%.

The first question of interest was to determine
whether the items actually discriminated among
the tested houses. One-way analyses of variance
were computed among all 13 dormitories for each
of the 238 environmental items (of the total 274
items, 20 were Crowne-Marlowe social desirability
(SD) items and 16 were halo items). Of these items,
87.9% were significant beyond the .05 level with
199, or 83.6% of the total, discriminating at the
.01 level. These results are
for the students and staff combined (for this analy-
sis both groups were combined since the number of
staff in this sample was quite small, consisting of
2.4% of the total N). Of the 238 environmental
items, 18 or 7.6% had significant (p < .05) cor-
relations with the total Crowne-Marlowe scale, in-
dicating that item responses by subjects were largely
not confounded by social desirability.
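
The item-level test described above, a one-way analysis of variance of a single item's responses across residences, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' computation: the function name, the 1/0 scoring of true-false answers, and the three small houses of data are all hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    `groups` is a list of lists; each inner list holds the scores of one
    residence on a single item (here 1 = "true", 0 = "false").
    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the house means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual responses around their own house mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical responses of three houses to one true-false item.
houses = [[1, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 1]]
f_ratio, df_b, df_w = one_way_anova_f(houses)
```

An item is said to discriminate among houses when its F ratio exceeds the critical value for the given degrees of freedom; looking up that critical value is left out of the sketch.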

Since it appeared that measures of the perceived 
environment could significantly discriminate among 
different living units, the next step was to select 
items for a revised version of the scale. Criteria 
used in selecting items for the revised (R1) form 
were as follows: First, an item should significantly 
discriminate among the houses tested. Second, 
items should not have true-false response splits 
more extreme than 80%-20% to be descriptive of
all residences. Third, each subscale should have
five true-keyed and five false-keyed items so that
acquiescent responding could be controlled. Last,
items should not be correlated with the Crowne-
Marlowe scale.
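
The four selection criteria can be expressed as a simple screening routine. The sketch below is a hypothetical illustration: the record fields, the .05 cutoff applied to both the discrimination and social desirability criteria, and the example items are assumptions, not values taken from the study.

```python
from typing import NamedTuple


class ItemStats(NamedTuple):
    """Summary statistics for one questionnaire item (hypothetical fields)."""
    name: str
    p_anova: float         # p value of the one-way ANOVA across houses
    prop_true: float       # proportion of respondents answering "true"
    p_desirability: float  # p value of the correlation with the SD scale


def passes_item_criteria(item, alpha=0.05, max_split=0.80):
    """Apply criteria 1, 2, and 4, which concern single items."""
    discriminates = item.p_anova < alpha
    balanced_split = (1 - max_split) <= item.prop_true <= max_split
    not_confounded = item.p_desirability >= alpha
    return discriminates and balanced_split and not_confounded


def is_balanced_subscale(keys, per_direction=5):
    """Criterion 3: five true-keyed and five false-keyed items per subscale."""
    return (keys.count(True) == per_direction
            and keys.count(False) == per_direction)


# Hypothetical item pool.
pool = [
    ItemStats("item_a", 0.001, 0.55, 0.40),  # passes all item criteria
    ItemStats("item_b", 0.300, 0.50, 0.60),  # does not discriminate
    ItemStats("item_c", 0.010, 0.93, 0.70),  # split more extreme than 80-20
    ItemStats("item_d", 0.020, 0.40, 0.01),  # correlated with SD scale
]
kept = [item.name for item in pool if passes_item_criteria(item)]
```

Criteria 1, 2, and 4 are checked item by item, whereas the keying balance in criterion 3 is a property of a whole subscale, which is why it is a separate function.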

These four criteria were applied to the item re- 
sponses from the dormitory sample and resulted in 
a 140-item R1 form of the URES composed of 14 
environmental subscales of 10 items (5 true, 5 
false) each. Of these items, 95% (133) significantly
discriminated among residences, and only 6% of
the items (9) had significant correlations with the
Crowne-Marlowe scale. The fifteenth scale (Halo)
was constructed from a group of extremely posi- 
tively and negatively worded items using the 
same criteria as above with the exceptions that: (a) 
the item should not discriminate among residences, 
and (b) the items should be endorsed in the keyed 
direction by fewer than 10% of subjects. 

Each of the 15 subscales of the URES (R1 ver-
sion) was then subjected to one-way analyses of
variance across the 13 dormitories. All 14 environ- 
mental subscales reliably differentiated among 
houses in the sample, while the Halo subscale did 
not differentiate among the houses tested. 


Revision of the URES: Form R1 

The psychometric properties of the scale, results 
from initial data collection, and enthusiasm from 
feedback of results to dormitory residents and ad- 
ministrative personnel encouraged the authors to 
collect data on a larger number and wider range 
of student residences. The demographic charac-
teristics of this current 74-residence norm group
are shown in Table 1.

Prior to this data collection, the decision
was made to revise the R1 version of the URES to 
(a) reduce the total number of items in the scale, 
(b) reduce the content overlap and seeming re- 
TABLE 1

BACKGROUND CHARACTERISTICS OF URES NORM GROUP

Residence types, in column order: co-ed dorms, men's dorms, women's
dorms, fraternities. [Column alignment is not recoverable for rows
with fewer than four counts; counts are listed in source order.]

Type of institution      | Number of residences | Range of residence size
Private university       | 3  3  7  8           | 25-300
Religious university     | 1  1                 | >200
State college            | 2  3                 | 88-210
Religious college        | 1  2                 | 27-84
Medical school           | 1  1                 | 100
Women's religious college| 3                    | 55-87
Fine arts college        | 1                    | 60
State college            | 1                    | 160
Women's college          | 2                    | 24-156
State university         | 7  3  4  8           | 16-110
State university[?]      | 1  2                 | 150-265
Liberal arts college     | 2  2                 | 30-60
State university         | 3  1  1              | 60-100

Note. All institutions located in California
except those indicated below.

a Located in Florida.

b Located in Ohio.

dundancy of some items, and (c) reduce the over-
lap among some subscales (e.g., Affiliation, Support,
Involvement, and Interpersonal Openness cor-
related with one another approximately .7).

A random sample (n = 505) of students was 
chosen from each house in the norm group with 
selection being made to ensure proportional sex
and class representation within each floor of each
residence. A factor analysis (varimax rotation)
was then performed to provide information about
possible item clustering other than the a priori
method initially employed in defining the subscales.
In general, the factors that emerged in this analysis
closely paralleled the R1 subscales. The main ex-
ception was that the first factor was a combination
of all the Affiliation and Involvement items and
five items from Support. Factor VI combined six
items from Independence and six items from
Social Propriety, and Factor IV combined three
items from Support and four from Competition.
Item intercorrelations, subscale intercorrelations,
and item-to-subscale correlations were then calcu-
lated for three successive trials with item deletion
and subscale recomposition after each trial as indi- 
cated. 

The subscales were reorganized using the criteria 
previously mentioned (i.e., reduction of item and 
subscale overlap and reduction of total scale 
length), and the additional criteria of high item- 
subscale correlation, and maximum discrimination 
of items. This latter criterion was met by comput- 
ing one-way analyses of variance for each item 
across all 74 houses in the norm group and choosing 
items with the most significant F ratios. This 
procedure resulted in a 96-item URES (Form R2)
grouped into 10 subscales*. Table 2 presents the 
subscales and their definitions. 


TABLE 2 


University RESIDENCE ENVIRONMENT SCALE: 
SUBSCALE DEFINITIONS 


Interpersonal Relationships: The emphasis on 
interpersonal relationships in the house 
eee ee eels A 
1. Involvement (10)"—Degree of commitment 

to the house and residents; amount of social 
interaction and feeling of friendship in the 
house. 4 
2. Emotional Support (10)—Extent of manifest 
concern for others in the house; efforts to ai 
one another with academic and personal 
problems; emphasis on open and honest 
communication. 


Personal Growth: Social pressure dimensions 
related to the psychosocial development 
of residents 
DG JUDA yw at UU. 
3. Independence (10)—Diversity of residents 
behaviors allowed without social sanctions, 
versus socially proper and conformist be- 

havior. 

4. Traditional Social Orientation (9)—Stress Hm 
dating, going to parties, and other “tradi- 
tional” heterosexual interactions. " 

5. Competition (9)—(This subscale is a pues 
between the Personal Growth and Intel 
Growth areas.) The degree to which & des 
variety of activities such as dating, Ek d 
etc., are cast into a competitive framework: 


"The R2 version of the URES is available fo 
research purposes and may be obtained by wi! 
the first author. 


| 
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TABLE 2—Cont. 


Iniellectual Growth: The emphasis placed on 
academic and intellectual activities related 
to cognitive development of residents 


5. Competition—As above. 


6. Academic Achievement (9)—Extent to which 
strictly classroom accomplishments and con- 
cerns are prominent in the house. 

7. Intellectuality (9)—Emphasis on cultural, 
artistic and other scholarly intellectual ac- 
tivities in the house, as distinguished from 
strictly classroom achievement. 


System Change and Maintenance: The degree of 
stability versus the possibility for change of 
the house environment from a system 
perspective 


8. Order & Organization (10)—Amount of for- 
mal structure or organization (e.g., rules, 
schedules, following established procedures, 
etc.) in the house; neatness. 

9. Innovation (10)—Organizational and indi- 
vidual spontaneity of behaviors and ideas; 
number and variety of activities; new ac- 
tivities. 

10. Student Influence (10)—Extent to which 
student residents (not staff or administra- 
tion) perceive they control the running of the 
house; formulate and enforce the rules, con- 
trol use of the money, selection of staff, food, 
roommates, policies, etc.


* Number of items in each subscale. 


The subscales are grouped into four categories: 
Interpersonal Relationships, Personal Growth, In- 
tellectual Growth, and System Change and Main- 
tenance. These categories appear to reflect the basic 
areas of concern of college students living in on- 
campus residences, and, as such, are seen as broad 
organizational themes underlying the scale. Similar
themes have been found in other perceived environmental
measures such as the Ward Atmosphere Scale (Moos &
Houts, 1968) and the Correctional Institution
Environment Scale (Moos,


RESULTS


Subscale Discrimination


Each of the 10 URES subscales was subjected to a
one-way analysis of variance across a sample of 13
residences in the current norm group to determine
whether they differentiate among residences. Table 3
shows that all 10 subscales discriminated very
significantly among the houses in the sample.


TABLE 3

URES SUBSCALE ANALYSIS OF VARIANCE ACROSS
THIRTEEN DORMITORIES

Subscale                            Fᵃ
Involvement                        7.75*
Support                            8.55*
Independence                      16.79*
Traditional Social Orientation    37.13*
Competition                        2.52*
Academic Achievement               4.98*
Intellectuality                    6.17*
Order & Organization              32.72*
Innovation                        12.47*
Student Influence                  7.52*

* p < .001.
ᵃ df = 12/451.


Reliability 


The reliability of Form R2 was estimated 
by employing internal consistency, test-re- 
test, and profile stability methods. Table 4 
presents the subscale internal consistencies 
for the original 13 dormitory sample (n = 
466). As can be seen, KR-20 correlations 
range between .76 and .87. This level of 
subscale homogeneity is quite satisfactory 
and remarkably high for scales composed of 
only 9 or 10 items each. 
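
For readers unfamiliar with the KR-20 coefficient reported here, the following is a minimal sketch of the standard Kuder-Richardson formula 20, k/(k - 1) × (1 - Σpq/σ²), for true/false items. The function name and the four-respondent, three-item data set are hypothetical, and the use of the population variance of total scores is an assumption, since the source does not say which variance estimate was used.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of respondents, each a list of 0/1 item scores."""
    k = len(responses[0])                  # number of items in the subscale
    n = len(responses)                     # number of respondents
    totals = [sum(r) for r in responses]   # each respondent's subscale score
    var_total = pvariance(totals)          # assumption: population variance
    pq_sum = 0.0
    for i in range(k):
        p = sum(r[i] for r in responses) / n   # proportion answering "true"
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical responses of four residents to a three-item subscale.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(responses)
```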

The temporal stability of individual perceptions
was measured by administering the URES to the same
subjects on three separate occasions in one men's and
one women's dormitory at a public university. The
product-moment correlations revealed that individuals
living in these two dormitories perceived their
respective environments in similar ways both 1 week
and 1 month after an initial testing. The correlations
presented in Table 4 range from .67 to .75 after
1 week and from .59 to .74 after 1 month. While there
is some decrease in the correlations from the 1-week
to the 1-month testing, as would be expected, the
drop-off is quite small, indicating adequate individual
stability of perceptions over 11% of the academic year.


TABLE 4
URES INTERNAL CONSISTENCYᵃ AND TEST-RETEST
RELIABILITIES ACROSS INDIVIDUALSᵇ IN
ONE MALE DORM AND ONE FEMALE DORM

                                             Time interval
Subscale                          KR-20   Time 1 versus   Time 1 versus
                                          1 week          4 weeks
Involvement                       .879    .740            .698
Support                           .816    .773            .710
Independence                      .772    .713            .592
Traditional Social Orientation    .868    .731            .742
Competition                       .766    .709
Academic Achievement              .88     .755
Intellectuality                   .836    .672
Order & Organization              .860    .705
Innovation                        .766    .699
Student Influence                 .805    .660            .65

ᵃ n = 466.
ᵇ n = 83.


The third important reliability compo- 
nent for an environmental scale is the sta- 
bility of subscale scores when the residence 
as a whole is the unit of measurement. The 
intraclass correlation derived from the 
analysis of variance was used to estimate 
profile stability 1 week and 1 month after 
the initial testing for the above two dormi- 
tories, and it provided a temporal stability 
index for all 10 subscales. For the men’s 
dormitory, the intraclass correlation is .96 
after 1 week and .86 after 1 month. For the
women's house, the stability is similar: .96
after 1 week and .98 after 1 month. Thus,
when the perceptions of house residents are 
pooled, the stability of the subjective envi- 
ronment becomes remarkably high. 
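
The profile-stability index can be illustrated with a short sketch. The source does not state which form of the intraclass correlation was derived from the analysis of variance, so this example assumes the one-way form, treating the subscales as targets and the two testing occasions as repeated measurements. The function name and the three-subscale profiles are hypothetical.

```python
def icc_oneway(profile_a, profile_b):
    """One-way intraclass correlation between two profiles of subscale means."""
    pairs = list(zip(profile_a, profile_b))
    n = len(pairs)   # number of subscales (targets)
    k = 2            # number of testing occasions per subscale
    grand = sum(a + b for a, b in pairs) / (n * k)
    target_means = [(a + b) / k for a, b in pairs]
    ss_between = k * sum((m - grand) ** 2 for m in target_means)
    ss_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, target_means))
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical house profiles (mean subscale scores) at two testings.
time1 = [2.0, 4.0, 6.0]
time2 = [4.0, 4.0, 4.0]
stability = icc_oneway(time1, time2)      # profile flattened at retest
identical = icc_oneway(time1, time1)      # profile reproduced exactly
```

A profile that is reproduced exactly at retest gives a coefficient of 1, while a profile whose shape changes gives a much lower value.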


Intrahouse Agreement 


The homogeneity of living-unit perceptions by
persons within the house was approached by computing
the percentage agreement for each subscale over the
original sample of 13 dormitories. For the 130
comparisons (13 houses by 10 subscales each), 113 are
greater than 70%. While some variation would be
expected (and may even itself be indicative of an
environmental quality), a reasonably high rate of
agreement by residents in a house should obtain and be
reflected in environmental measures. In general, the
URES fared well on this criterion and reflected a high
degree of consensus among residents (a similar method
is presented by Pace, 1969, who used a two-thirds
agreement criterion for scoring the CUES).


Subscale Independence 


Subscale correlations for the revision sample are
presented in Table 5. As can be seen, most of the
subscales are only moderately correlated with one
another and many are essentially uncorrelated. The
mean of all the correlations is .184. The degree of
overlap thus appears to be sufficient to conclude that
the subscales are measuring aspects of a diverse but
unified environment while sharing a small enough
common variance to tap the unique components of the
residence climate.

While the mean of all the subscale correlations is
in the moderate range, there are certain exceptions
that in themselves lend some support to the internal
validity of the URES. The highest positive
relationship (r = .62) occurred between Support and
Involvement, with these two subscales also being
significantly related to Intellectuality and
Innovation. Further, Support was negatively correlated
with Competition. Thus it appears that residences that
are seen as interpersonally involving are also seen as
supportive, innovative, and intellectual. In another
area of the social climate, houses that are seen as
having considerable Independence are also seen as
having little Order & Organization and as being low in
Traditional Social Orientation.


TABLE 5
SUBSCALE CORRELATIONS

Subscale                                1      2      3      4      5      6      7      8      9
 1. Involvement
 2. Support                           .62
 3. Independence                     -.12    .18
 4. Traditional Social Orientation   -.05  -.001   -.38
 5. Competition                       -.1   -.33   -.05    .19
 6. Academic Achievement             -.09    .08   -.20   -.06   -.07
 7. Intellectuality                   .41    .43   -.03   -.14   -.06    .26
 8. Order and Organization            .19    .24   -.40    .27   -.06    .23    .18
 9. Innovation                        .57    .45    .16   -.15   -.12   -.18    .48
10. Student Influence                 .20    .17    .08   -.13   -.16    .09    .16

Note. N = 505.


[Figure 1: profile plot of mean subscale scores (0 to 10) for
men's dormitories (M; n = 12), women's dormitories (W; n = 28),
and co-ed dormitories (C; n = 15), with the overall F
significance level (* p < .05, ** p < .01, *** p < .001) and
pair-wise comparisons marked for each subscale.]

FIG. 1. URES profile comparisons of men's, women's, and co-ed dormitories.




Illustrative Results

Several URES profiles from the normative sample of
74 residences are presented below to illustrate some
of the utility of the scale and to present substantive
data about the psychosocial environment of student
residences.


Residence Profiles 

Profiles can be constructed that show the 
average perceptions of a residence group or 
any subgroup within a house. Figure 1 pre- 
sents the perceptions of student residents in 
28 women’s, 15 co-ed, and 12 men’s dormi- 
tories. Women see their houses as emphasiz- 
ing general psychological support, interper- 
sonal and house involvement, and as stress- 
ing more traditional culturally valued be- 
haviors than do men’s houses. On the other 
hand, men perceive their houses as stressing
competitive and nonconformist qualities more heavily
than the women's houses. While the “conventional
wisdom” about men's and women's dormitories seems to
receive some support inasmuch as women's dorms are
seen as having more interpersonal and socially
traditional concerns, while men's dorms de-emphasize
these qualities, nonetheless, some unexpected
similarities between men's and women's dorms do emerge.
For example, men’s dorms are generally de- 
scribed as more spontaneous and academi- 
cally achieving than women’s dorms. In the 
present sample, assessment by the URES 
does not reveal such differences. 

On the other hand, co-ed houses are per- 
ceived as possessing as much Support and 
Involvement as women’s houses, and as 
much Independence and Traditional Social 
Orientation as men’s residences. They are 
also seen as low in Competition and Order 
& Organization but high in Innovation and 
Intellectuality. 

It is interesting to note that residents of
co-ed houses perceive their environments as 
stressing personal concern, involvement, 
mutual support, and a high degree of both 
independence and achievement. While this 
finding in itself may be significant in the 
assessment of these different living arrange- 
ments, a further important question is 
whether these environmental differences are 
due to preselection of student residents, the 
results of the living experience itself, or 
whether it is an interaction. Further studies 
are planned to elucidate this process. 


Intrahouse Comparisons 


Within any residence, various subgroups 
may differentially perceive the environ- 
ment, and this may in turn influence the 
overall level of satisfaction or conflict in the 
house and provide clues to the locus of such 
strain. One example of such subgroup comparisons is
the perceptions of male and female students living in
the same co-ed residence. Other interesting
comparisons could be made for students versus staff,
senior versus freshman students, new versus old
residents, etc. A sample of three co-ed houses
(n = 195) tested in one private university was
compared to provide an illustration of the potential
of the URES in this area.

In these three co-ed dormitories, the men and women
perceive the house environment almost identically,
with no significant differences emerging on any
subscale. One reason for the close congruence of
perceptions in these three houses may be that co-ed
housing was in its fourth year at the university
sampled, and this may have allowed sufficient time for
a set of “cultural” norms to be established and
transmitted to new residents. Thus potential
disparities of attitude, perceptions, and behavior of
both sexes could be accommodated within an overarching
set of values.

An alternative hypothesis is that since 
students living in the relatively few co-ed 
houses then available on this campus were 
self-selected, they entered with similar ex- 
pectations rather than these attitudes and 
perceptions being shaped by the living envi- 
ronment. It would be quite interesting to 
make similar comparisons at institutions 
that were in their first year of co-ed living 
arrangements, and where the student's 
housing choices were more restricted. 


Comparison of Dormitories and 
Fraternities 


An important use of the URES may be in comparing
different residence philosophies as reflected in the
type of programs and residence organizational
structures developed at various institutions. Not only
can the pervasive dormitory-fraternity dichotomy be
compared, as below, but residences with various
programs can also be evaluated and contrasted to other
such experiments. Figure 2 presents the profiles of
three men's dormitories and eight fraternities on the
same campus. By comparing only residences drawn from
the same institution, personality characteristics of
individuals are less likely to emerge as a significant
component than if houses from different institutions
were contrasted.

While differences in the System Change and
Maintenance area and in Traditional Social Orientation
and Competition might be expected (e.g., Scott, 1965),
it is interesting to note the differences in the
Interpersonal Relationship area. Fraternity members
see their houses as having more Involvement and
marginally more Support than the men's dormitories.
These differences may be the joint effect of three
variables. First, since fraternities select future
members and initiate them, the degree of commitment
and group cohesion may be enhanced (Festinger, 1957).
Second, this selection process tends to increase the
likelihood that members are similar in values,
interests, and attitudes to the existing members,
which may lead to greater interpersonal attraction
among members and thereby further increase group
cohesion and organizational loyalty (Newcomb, 1961).

Third, since the mean size of the fraternities
(X̄ = 35) is smaller than that of the dorms (X̄ = 47)
in this sample, it is possible that more face-to-face
interaction and mutual influence occurs in the
fraternity than in the dormitory. This process may
also be enhanced by “ecological” processes such as the
physical, “home-like” design of the house (Van der Ryn
& Silverstein, 1967) versus the more institutional
architecture of the dorms.


[Figure 2: profile plot of subscale scores for men's
dormitories (n = 3) and fraternities (n = 8), with
significant differences marked for each subscale.]



Individual House Comparisons

Figure 3 compares two individual houses normed
against the 74 residences in the present sample.
Individual profiles such as these may be used for
“feedback” to particular residences and can serve as
the basis for discussions aimed at making specific
changes in house atmosphere by the residents
themselves. In order to illustrate the wide
differences between individual university residences,
an undergraduate, co-ed theme house and a medical
student men's house were contrasted.


[Figure 3: profile plot of subscale standard scores for a
medical students' dorm (n = 14) and a co-ed theme dorm
(n = 48).]

FIG. 3. URES profile comparisons of two houses normed against
the 74-house standardization group.

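
Norming a house against the standardization group amounts to expressing its subscale mean as a standard score. The sketch below shows one way to do that; it is an assumption that the norm distribution is made up of house means and that the population standard deviation is used, and the function name and the four norm values are hypothetical stand-ins for the 74 houses.

```python
from statistics import mean, pstdev


def standard_score(house_mean, norm_house_means):
    """Express one house's subscale mean in standard deviation units
    relative to the subscale means of the normative houses."""
    mu = mean(norm_house_means)
    sigma = pstdev(norm_house_means)   # assumption: population SD of house means
    return (house_mean - mu) / sigma


# Hypothetical subscale means for a (very small) normative group of houses.
norm_means = [4.0, 5.0, 6.0, 5.0]
z = standard_score(7.0, norm_means)
```

On this scale, two houses can be compared by the difference between their standard scores on the same subscale.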

Programmatically, the theme house was 
organized around the area of international 
relations. There was a great stress placed on 
intellectual discussions of world problems 
and an active program of invited speakers, 
and new activities were continually being 
generated in the house. Informally, the fac- 
ulty advisor (who lived in the house and 
was a strong influence) indicated that he 
wanted the students to be the intellectual 
and academic elite of the university. 

By contrast, the medical students appeared to be
quite disinterested in any activities not directly
related to their academic pursuits. Frequent comments
were made to the test administrators about their house
being more like a hotel than a dormitory. Many people
said they did not even know the names of their
next-door neighbors, and in response to the scale item
“People here never talk to one another at mealtimes,”
an astounding 80% answered affirmatively. In the
Interpersonal Relationship area, there was almost a
5 standard deviation spread between the houses. There
were also very large differences on such subscales as
Competition and Intellectuality.

The theme house appeared as a warm, innovative, and
intellectual house, while the medical students
described their house as having a very unsupportive,
competitive, and achievement-oriented environment.


Discussion 


In the past, when comparisons were made between
dormitories, fraternities, and other student living
groups, the dimensions used were either readily
observable indices such as number of students, number
of staff, amount of floor space per student, etc., or
naturalistic observations codified into case study
vignettes (exceptions to this are Scott, 1965, and
Centra, 1967). The results
from the URES demonstrated that the per- 
ceived social-psychological climate could 
be reliably measured and thus aid in the 
systematic description and comparison of 
university residences. The psychometric 
and conceptual properties of the scale en- 
courage its use in a number of research 
directions, some of which are summarized 
below. 


Programmatic Evaluation 


The URES may be an effective tool in 
the evaluation of the impact on students of 
programmatic and compositional innova- 
tions. For example, many universities are 
currently instituting “living and learning” 
dormitories where much of the traditional 
class and seminar teaching is integrated into 
the residence with faculty members often 
living in the house. Other colleges and uni- 
versities are establishing experimental living 
arrangements such as co-ed housing and bi-ethnic
dormitories where 20%-50% of the
residents are students from minority groups 
currently entering universities in significant 
numbers. 

In both of these areas, a primary empirical issue
concerns the extent and type of impact on the member
of the living unit. The URES may be useful in
providing one type of evaluative information in
assessing the adequacy of existing programs and
pointing directions for additional changes.


Change in Residence Climate 


While programmatic innovations may effect changes
in the environment of a student residence,
student-initiated change may be more effective and
provide a richer interpersonal learning experience.
Such internally generated changes (via encounter
groups, student projects, etc.) may be assessed by the
URES, and more interestingly, the scale itself may be
incorporated in a change program. There is some
evidence (Moos & Otto, 1972; Pierce, Trickett, & Moos,
1972) that people's knowledge of their own environment
may be a powerful tool in enabling them to plan and
implement changes along desired dimensions.

The URES feedback may take a variety 
of forms. For example, a comparison illus- 
trating for residents their perceptions of an 
"ideal" house versus their perceptions of 
their actual living situation may be used as 
a basis to plan change strategies to reduce 
the real-ideal discrepancies. Further, a
comparison of the residence perception by 
staff and students could make clear to each 
the areas of conflict, confusion, and con- 
tradictory expectations of their shared envi- 
ronment, and thus enhance the possibility 
of cooperative change efforts. 


Individual Impact 

The effect of the immediate social envi- 
ronment on individual student development 
may also be approached using this instru- 
ment. For example, the manner in which a 
student perceives the social climate of his 
residence may influence his subjective mood 
states such as feelings of depression, alienation,
and isolation. Furthermore, a student's
satisfaction with his residential environ- 
ment may influence his perception of him- 
self and his overall college experience so 
that pursuit of relationships with others 
and the degree of involvement in intellec- 
tually and emotionally significant activities 
may be affected. 


Architectural and Design Influences 

While large sums of money have been spent on the
design and construction of student housing, only
sporadic attempts to assess the impact on their users
have been made (e.g., Avery, 1971; Proshansky,
Ittelson, & Rivlin, 1970; Sommer, 1969; Van der Ryn &
Silverstein, 1967). For example, it
may be that student residences that are de- 
signed in small clusters of rooms around a 
central courtyard are perceived as having 
more Support and Involvement than dormi- 
tories arranged in straight-line corridors. It 
may be possible that the psychological and 
behavioral consequences of variations in ar- 
chitectural planning can be approached 
using the URES as a measure of the psychosocial
atmosphere.


Person-Environment Interaction

The URES and other environmental as- 
sessment instruments such as the WAS, the 
CCI, CUES, and the ICA have implications 
for the assessment, prediction, and modifi- 
cation of behavior. As trait theories of per- 
sonality have been replaced by interactive 
theories, the necessity for the measurement 
of environmental settings in which behavior 
occurs has increased (e.g., Mischel, 1968). 
Not only must situational variables be 
specified more exactly but also the bounda- 
ries and common elements of various envi- 
ronments must be delimited. When the en- 
vironmental regulators of behavior are 
more fully documented it may become pos- 
sible to delineate the interpersonal skills 
appropriate to subsets of common environ- 
ments (Goldfried & D’Zurilla, 1969) and to 
enhance an individual’s coping skills neces- 
sary for acceptable behavior in particular 
interactive domains.


EFFECTS OF MEANINGFUL AUDITORY STIMULATION ON
CHILDREN'S SCHOLASTIC PERFORMANCE¹


HOWARD KASSINOVE²
Hofstra University


The purpose of this study was to investigate the effects of meaningful 
auditory stimulation on children’s scholastic performance. Forty third- 
grade and 40 sixth-grade elementary school children worked independ- 
ently on a series of self-paced addition or division problems, either in 
quiet or in the presence of several types of auditory stimuli. Task 
performance was evaluated in terms of mean time per response, varia- 
bility of response time, probability of error, number of correct re- 
sponses, number of “time-outs,” and changes in these behaviors over 
time. Results indicated that the auditory stimulation in no way af- 
fected task performance. There was neither an overall effect nor an 
effect over time. It was argued that great amounts of money should 
not be spent by either schools or parents in order to eliminate
moderate amounts of noise from a child's academic environment.


Much of the existing research on the ef- 
fects of auditory stimulation on behavior 
has used adult subjects, artificial condi- 
tions, esoteric tasks, and noises of a level or 
kind not likely to be encountered on a con- 
tinuous basis by the typical elementary 
school child. These investigations have 
yielded conflicting results. Detrimental ef- 
fects of noise have been reported by Obata, 
Morita, Hirose, and Matsumato (1934), 
Smith (1951), Broadbent (1958a, 1958b),
Winnick and Lerner (1963), Baker and 
Madell (1965a, 1965b), and Turnure 
(1970). Improved performance has been re- 
ported by Ellis, Hawkins, Pryer, and Jones 
(1963) and Turnure and Zigler (1964). 
Tinker (1925), Hovey (1928), Guertin 
(1959), and Park and Payne (1963) re- 
ported no effects of noise, either positive or 
negative. 

¹ This article is based on a dissertation submitted
in partial fulfillment of the requirements for the
PhD degree at Adelphi University. Appreciation is
extended to Patrick Ross, Sonia Osler, and Michael
Merbaum for their generous assistance during all
phases of this study.

² Requests for reprints should be sent to Howard
Kassinove, Department of Psychology, Hofstra
University, Hempstead, Long Island, New York.

Only one investigator has specifically studied the
effects of noise on academic performance in children.
Slater (1968) had seventh-grade pupils complete an
achievement test under quiet, average, and noisy
conditions. In a classroom condition, auditory
stimulation was provided by various machine noises,
noises of locker usage, talking, whistling, laughing,
etc. In an experimental condition, white noise of 50
or 80 decibels was provided for the quiet and noisy
conditions, respectively. Her results indicated no
significant difference in either speed or accuracy of
performance as a function of noise intensity or sex of
the children. She pointed out, however, the need to
study the effects of noise over time and the possible
effects of noise on tasks of a different kind.

The major objective of this study was to extend
Slater's findings by investigating the general and
temporal effects of loud, continuous, meaningful
auditory stimulation, of the level and kind likely to
be found in a child's environment, on self-paced
arithmetic performance.

A second objective was to investigate certain
additional parameters that may play a part in
determining individual differences in reactions to
noise. These included grade and ability level, task
difficulty level, and number and discriminability of
irrelevant noise conditions. Two elementary school
grade
levels were chosen since Slater (1968) found 
no noise effects at Grade 7, while Turnure 
(1970) reported detrimental noise effects in 
nursery school children. It was hypothe- 
sized, therefore, that a noise effect would be 
more likely to occur in younger rather than 
older children. It was also decided to inves- 
tigate the effects of level of student ability, 
since Baker and Madell (1965a, 1965b) 
found a significant effect of this variable in 
their work with male college students. They 
found underachievers to be more susceptible 
to distraction than achievers. In order to be 
realistic, multiple auditory distractors were 
included since, in the real world, one is 
often faced with more than one auditory 
distraction while trying to concentrate on a 
task. 

It was hoped that by investigating the 
effects of these independent variables on 
self-paced arithmetic performance, as meas- 
ured by a series of dependent variables, we 
would be able to specify further the rela- 
tionship that exists between pupil perform- 
ance and meaningful auditory distractors. 


METHOD 
Subjects 


The subjects consisted of 40 third-grade and 40
sixth-grade white, elementary school children. They
were chosen from a suburban school system on the
basis of availability and on the basis of meeting
independent variable criteria described below. An
equal number of boys and girls was used. Repeaters
and special-class children, as well as children with
obvious hearing deficiencies, were excluded. Once
selected, all subjects completed the entire testing
session.


Independent Variables 


The effects of four independent variables were
studied.

Meaningful auditory stimulation effects. Children
performed under one of five conditions: (a) no
auditory stimulation; (b) stories: Danny Kaye reading
a series of Hans Christian Andersen fairy tales;
(c) music: popular recordings varying in tempo;
(d) less discriminable multiple auditory stimuli:
music and fairy tales simultaneously presented from
the same source; and (e) more discriminable multiple
auditory stimuli: music from one physical source and
fairy tales from an opposing physical source.

Grade level. Half the subjects were third graders
and half were sixth graders.

Achievement level. Prior to testing, the reading
and arithmetic sections of the Wide Range
Achievement Test (Jastak & Jastak, 1965) were
administered to each subject. A mean of the two
scores was taken, and each child was classified as
above or below grade expectation. Grade expectation
represented the expected level of achievement in
months, and varied with the month of testing.
Computation of expectation was performed according to
standard procedures found in the test manual.

Task difficulty level. Children were randomly 
assigned to one of two task difficulty groups. After 
consultation with school personnel, an “easy” and 
a "difficult" series of arithmetic problems, all of 
which could be done by third or sixth graders under 
optimal conditions, was generated. The easy series 
represented simple problems capable of “auto- 
matic” solution. The difficult series was relatively 
more complex and required more cognitive effort 
for solution. Two grade levels were used, and two 
different series of problems were generated at each 
level. For the third graders, the easy problem set 
required the addition of two one-digit numbers, all 
of which summed to between 6 and 9, and for 
which no “carrying” operation was therefore required
(e.g., 4 + 3 =; 6 + 1 =; 5 + 2 =). The difficult
stimulus set consisted of problems requiring the
addition of a two-digit number to a two-digit number.
All required one carrying operation, and summed to
between 31 and 99 (e.g., 38 + 14 =; 27 + 35 =). For
the sixth graders, division
problems were generated. The easy set consisted 
of a two-digit number with a one-digit divisor. The 
divisor varied between 3 and 9. The difficult series 
consisted of a four-digit dividend varying between 
2,111 and 5,999. The divisor varied between 26 and 
99. Numbers with zeros were excluded in all of the 
above problem sets. Only division problems in 
which the dividend was divisible by the divisor 
without a remainder were used.
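
The constraints on the four problem sets can be written down as simple checks. The following sketch is illustrative only: the function names are hypothetical, and it assumes that the exclusion of "numbers with zeros" applies to the numbers printed on the card (addends, dividend, and divisor) and not to the answers, which the text does not specify.

```python
def has_zero(n):
    """True if the decimal representation of n contains the digit 0."""
    return "0" in str(n)


def is_easy_addition(a, b):
    # Two one-digit addends whose sum lies between 6 and 9 (no carrying).
    return 1 <= a <= 9 and 1 <= b <= 9 and 6 <= a + b <= 9


def is_difficult_addition(a, b):
    # Two two-digit addends without zeros, sum between 31 and 99.
    if not (10 <= a <= 99 and 10 <= b <= 99):
        return False
    if has_zero(a) or has_zero(b):
        return False
    # With a sum of at most 99 the tens column cannot carry, so the
    # single carrying operation must come from the units column.
    units_carry = (a % 10) + (b % 10) >= 10
    return units_carry and 31 <= a + b <= 99


def is_easy_division(dividend, divisor):
    # Two-digit dividend, one-digit divisor from 3 to 9, no remainder.
    return (10 <= dividend <= 99 and 3 <= divisor <= 9
            and not has_zero(dividend) and dividend % divisor == 0)


def is_difficult_division(dividend, divisor):
    # Four-digit dividend 2,111-5,999, divisor 26-99, no remainder.
    return (2111 <= dividend <= 5999 and 26 <= divisor <= 99
            and not has_zero(dividend) and not has_zero(divisor)
            and dividend % divisor == 0)


# All ordered pairs of addends that qualify for the easy third-grade set.
easy_addition_set = [(a, b) for a in range(1, 10) for b in range(1, 10)
                     if is_easy_addition(a, b)]
```

The examples quoted in the text (4 + 3, 6 + 1, 5 + 2, 38 + 14, 27 + 35) all pass the corresponding checks.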

A 5 × 2 × 2 design was employed (Kind of
Auditory Stimulation × Achievement Level ×
Task Difficulty Level) at each of the two grade
levels.
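
The factorial layout can be enumerated directly. In this sketch the condition labels are paraphrases of the conditions described above, and the per-cell figure assumes that the 40 children at each grade level were spread equally over the cells, which the text implies but does not state.

```python
from itertools import product

# Levels of the three crossed factors (labels paraphrased from the text).
STIMULATION = [
    "no auditory stimulation",
    "stories",
    "music",
    "music and stories, same source",
    "music and stories, opposing sources",
]
ACHIEVEMENT = ["above grade expectation", "below grade expectation"]
DIFFICULTY = ["easy", "difficult"]

# Every combination of levels is one cell of the design.
cells = list(product(STIMULATION, ACHIEVEMENT, DIFFICULTY))

subjects_per_grade = 40
# Assumption: equal cell sizes within each grade level.
subjects_per_cell = subjects_per_grade / len(cells)
```

Under that assumption each of the 20 cells at a grade level holds two children.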


Procedure 


All children were tested individually. They
worked alone in a room, separated from the
experimenter by a screen, for 45 minutes under one
auditory stimulation condition and one difficulty
level condition. Problems were presented in large
primary school type on 5 × 8 inch note cards. There
was one problem per card, and each child was presented
with enough cards to keep him busy for 45 minutes.
Children were asked to copy each problem onto answer
sheets and then to compute the answer. No feedback
about correctness of the child's response was given.
The meaningful auditory stimulation was delivered
directly into the room by two identical Craig cassette
tape recorders. The intensity level was set by use of
a sound level meter (General Radio Company, Concord,
Massachusetts, Model 1551-C). All stimuli were set so
that when combined with the ambient noise level in the
room they varied between 70 and 80 decibels.

The subjects worked at a spacious desk contain- 
ing a light, paper, and pencils. Directly in front of 
them was a shield with a one-way mirror through 
which the experimenter observed the child's re- 
sponses. The entire experimental area was closed 
off by a large screen, and the subjects thought 
they were alone in the room. 

Each child, selected by the teacher at her discre- 
tion, was brought into the testing room and asked 
to take a seat at the table. Following the establish- 
ment of rapport, the giving of directions, and a few 
practice trials, the examiner closed the screen and 
left the child alone in the immediate testing en- 
vironment. 

The following directions were given to each 
child: 

I want to see how well you can do arithmetic
examples. In front of you there are a lot of
examples for you to add (divide). When I say
go, copy the problems, one at a time, into the
boxes on the paper. Write down the problem
and then find the answer. When you finish one
page go on to the next. Do not go back once you
have finished the problem. [Additional instruc-
tions were given to the groups receiving auditory
stimulation: “Pay no attention to the ____.
Keep working on the problem.”] Let's try a few
for practice, O.K.? [Then the experimenter al-
lowed the subject to try three or four problems.]
I'm going to leave and you will have to work
for a long time by yourself. Are there any ques-
tions? Ready, go!


While the child was working the experimenter 
observed his behavior through the one-way mirror. 
A stopwatch was used for the recording of time, 
and time was recorded to the nearest second. 


Dependent Variables 


Since there was no evidence to indicate that any 
particular measure would be the best reflection of 
the effects of the independent variables, a number 
of dependent variable measures were calculated 
for each subject. These included the mean time per 
response, the variability (standard deviation) of 
response times, the number of correct responses,
the probability of error (number wrong divided by 
number attempted), and the number of time-outs 
taken. A time-out represented a period during 
which the subject turned away from the task and 
seemed to tune out for 2 seconds or more. The 2- 
second-or-more criterion had to be associated with 
a qualitative judgment that the subject was un- 
involved with the task. 

In addition, change in the time per response 
during the 45-minute period was evaluated by 
calculating the slope of the regression line relating 
time per response and time into the task. A posi- 
tive slope indicated response time increased as time 
passed, while a negative slope indicated response 
time decreased as time passed. 
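
The slope measure just described can be illustrated with an ordinary least-squares fit of time per response on elapsed time. This is only a minimal sketch: the function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def response_time_slope(elapsed_minutes, seconds_per_response):
    """Least-squares slope of time per response on time into the task."""
    if len(elapsed_minutes) != len(seconds_per_response) or len(elapsed_minutes) < 2:
        raise ValueError("need two or more paired observations")
    x_bar = mean(elapsed_minutes)
    y_bar = mean(seconds_per_response)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(elapsed_minutes, seconds_per_response))
    sxx = sum((x - x_bar) ** 2 for x in elapsed_minutes)
    if sxx == 0:
        raise ValueError("elapsed times must not all be equal")
    return sxy / sxx


# Hypothetical child who slows down as the session goes on (made-up values).
elapsed = [5, 15, 25, 35, 45]    # minutes into the task
latency = [20, 22, 24, 26, 28]   # seconds per response
slope = response_time_slope(elapsed, latency)  # positive: responses slowed
```

A positive value corresponds to response time increasing over the session, and a negative value to response time decreasing.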

In order to evaluate the effect of time on the
probability of error, the number of time-outs, and
the number of correct responses, the 45-minute
period was divided into three 15-minute sections.
These behaviors were calculated separately for
each of the three periods.

The results of each grade level were eval-
uated independently by three-way factorial
analyses of variance. Effects over time were 
evaluated by dividing the 45-minute period 
into three 15-minute sections and comput- 
ing Lindquist (1954) Type 3 mixed analy- 
ses of variance. An exception was the anal- 
ysis of time per response over time, which 
was evaluated by an analysis of variance 
on the above-mentioned slope constants.


Grade 3


There was no indication of a significant 
main effect or interaction as a function of 
the meaningful auditory stimulation as
measured by any of the dependent varia- 
bles. As would be expected, however, the 
analyses of variance indicated a significant 
main effect of problem difficulty level for 
mean time per response (F = 10.20, df = 
1/20, p < .01), probability of committing 
an error (F = 9.60, df = 1/20, p < .01),
and number of correct responses (F = 
39.30, df = 1/20, p < .01). This provided 
validity in the sense that the difficult prob- 
lems were really more difficult since they 
took longer to solve and were more likely to 
be solved incorrectly. There was no signifi- 
cant source of variation with respect to var- 
iability of response time or number of 
time-outs taken. 

With regard to temporal changes over the 
45-minute period, the analyses indicated no 
effect of time period for number of correct 
responses or for slope constants. A signifi-
cant main effect of time period was found for number of time-outs (F
= 8.03, df = 2/72, p < .01). The number of
time-outs taken increased among the three 
15-minute segments of the task. The tempo- 
ral analyses for probability of committing 
an error indicated a significant interaction 
between difficulty level and time period 
(F = 4.00, df = 2/72, p < .05). For the easy
problems the probability of committing an
error was consistently low, while for the
difficult problems the middle time period
was the one associated with the lowest 
probability of error. 


Grade 6 


At the sixth-grade level, meaningful au- 
ditory stimulation also was not a significant 
source of variance for any of the dependent 
variables. Here again, level of difficulty sig- 
nificantly affected variability of response 
time (F = 54.27, df = 1/20, p < .01), prob- 
ability of committing an error (F = 17.53, 
df = 1/20, p < .01), number of correct 
responses (F = 97.22, df = 1/20, p < .01), 
and mean time per response (F = 105.70, df 
= 1/20, p < .01). This supported the inter- 
pretation that the problems at the sixth- 
grade level were validly of different diffi- 
culty levels. A main effect of achievement 
level was noted for mean time per response 
(F = 6.71, df = 1/20, p < .05) and number 
of correct responses (F = 5.35, df = 1/20, p 
< .05), supporting the fact that the children
were of different ability levels. 

In terms of performance over time, no
significant effect was noted in regard to 
slope constants, variability of response 
time, or probability of error. With regard to 
number of time-outs, a significant triple in- 
teraction was found among time period, 
level of achievement, and difficulty level (F 
= 3.56, df = 2/39, p < .05). The maximum 
number of time-outs was taken by above- 
expectation achievers working on difficult 
problems in the last time period. 


Individual Differences and Qualitative 
Remarks 


In spite of the fact that there was no 
overall effect of the auditory stimulation, 
obvious individual differences were noted. 

For example, two of the better-achieving 
third-grade subjects working on the difficult 
problems differed markedly in their respon-
ses to the stories being played. One of them
seemed to be listening intently to the vary-
ing stories and was obviously tuning in to
them on various occasions. His performance
was slow and poor. The other child seemed
to pay absolutely no attention to them and, 
when questioned later, could give no details 
of any of the stories.

The quality and quantity of time-outs
taken by subjects in this experiment also 
varied greatly. Some children took no 
time-outs at all, while others took up to 20 
or more time-outs during the 45-minute 
testing session. While most of the time-outs 
appeared to be from 2 to 5 seconds in dura- 
tion, a small minority was up to 5 minutes 
in length. In addition, a number of different 
behaviors were engaged in during the time- 
outs. Some children looked in the mirror 
and made faces, others seemed to attend to 
the auditory stimulation, others looked 
around the room, and still others just 
seemed to stare into space. The quality and 
quantity of time-outs bore little relation- 
ship to most other performance measures. 

The extreme variety of behavior engaged 
in during the time-outs, as well as the vari-
ations in responses to the auditory stimuli,
lead to the conclusion that individual dif- 
ferences in ability to focus on the task for 
45 minutes accounted for much of the be- 
havior exhibited in this experiment. Under 
all noise conditions some children seemed 
distracted by the noise, but the distraction 
was apparently equal across conditions. An 
equal amount of distraction by nonnoise en- 
vironmental stimuli (e.g., the mirror, the 
child’s own clothing) seemed to occur in the 
quiet condition. This accounts for the lack 
of a significant difference among meaning- 
ful auditory stimulation conditions. 


Discussion 


The primary finding of the present inves- 
tigation was that elementary school chil- 
dren of the ages studied were quite capable 
of performing at an adequate level in the 
face of various kinds of irrelevant auditory 
stimulation in the range of from 70 to 80 
decibels. There seemed to be little if any 
effect of meaningful auditory stimulation 
on speed or accuracy of response, as pre-
viously shown by Slater (1968) and sup-
ported in the present experiment. In addi-
tion, there seemed to be no effect on varia-
bility of response time, tendency to tune out
from the task, or changes in any of these 
variables over time. 

Under all conditions in the present study 
some children tuned away from the task for
some portion of the 45-minute experimental
period. If noise was present in the environ- 
ment, many of them attended to it. If not, 
they focused on nonnoise environmental 
stimuli. It seems reasonable to conclude 
that when elementary school children are 
asked to work on a repetitive scholastic 
task for a 45-minute period, they are likely 
to focus on the task for less than 100% of 
the time. During nonfocusing periods they 
may attend to a variety of stimuli, includ- 
ing auditory stimuli, if present. 

Based on the data from the present study 
and from the Slater study, it would seem 
reasonable to suggest that schools should 
not go out of their way to try to sound-con- 
dition classrooms in an attempt to increase 
achievement levels as previously suggested 
(Conrad & Gibbins, 1963). The evidence 
seems to indicate that while noises in excess 
of 100 decibels are detrimental to perform- 
ance, noises in the range of from 70 to 80 
decibels, which subjectively seem loud but 
not intense, do not seem to cause perform- 
ance deficits. It would also seem logical to 
conclude that when classes are broken up 
into small groups, with the teacher working 
with one of these groups and the others 
working quietly at their seats, there is no 
detrimental effect of the noise coming from
the group on the other students. 

Many children claim that they can do 
their homework with the radio on. In one 
condition in the present experiment, chil- 
dren performed with rock and roll music 
playing, and performed no better and no
worse than children in the other conditions.
The present study gave no indication that 
children should not be allowed to do their 
homework with the radio on. 

Susceptibility to the effects of noise seems 
to be more a function of individual charac- 
teristics of the child than of any of the var- 
iables under investigation in the present 
study. It would seem wise to conclude that 
moderate to loud noise does not affect aca- 
demic behavior to any significant degree. 
Complaints that noise interferes with edu- 
cation would have to be supported by ex- 
perimental data before schools or parents 
should be coerced into spending vast sums 
of money to eliminate such noise. 


HOWARD KASSINOVE


STRUCTURAL VARIABLES THAT DETERMINE
PROBLEM-SOLVING DIFFICULTY IN
COMPUTER-ASSISTED INSTRUCTION¹


ELIZABETH F. LOFTUS² AND PATRICK SUPPES³
Institute for Mathematical Studies in the Social Sciences, Stanford University 


The research examined the problem-solving performance of 16 sixth-
grade students from a depressed area. The students were first taught
the mechanics of how to use a computer-based teletype to solve arith-
metic word problems. Following the initial instruction set, a series of
100 word problems was presented to the students. The solutions of
these problems were analyzed to determine the variables related to
problem difficulty. A linear regression analysis showed that a word
problem is difficult to solve if (a) it is of a different type from the
problem that preceded it, (b) its solution requires a large number of
different operations, (c) its surface structure is complex, (d) it has a
large number of words, or (e) it requires a conversion of units.


There exists a great diversity of ap- 
proaches to the investigation of human 
problem solving, and a wide range of mate- 
rials, techniques, and “problems” has been 
used for such study. Subjects have been re- 
quired to solve anagrams, matchstick prob- 
lems, water-jar problems, pendulum prob- 
lems, concept-identification problems, anal- 
ogy problems, number-series problems, or 
arithmetical word problems, to name but a
few. Although several theoretical formula- 
tions have been offered and many facts 
have been discovered, there is still no single 
adequate theory into which they can be in- 
tegrated. In addition, there is little analysis 
of why arithmetic word problems, specifi- 
cally, are difficult for students. We know 
that students have difficulty in solving word 
problems. The present study was an at-
tempt to find out why. It is an attempt to
explore the notion that in solving a set of
word problems, certain items are more diffi-
cult to solve than others, and to understand
what structural variables cause some word
problems to be hard and others to be easy.
The term structural indicates that the focus
of attention is on the variables that charac- 
terize the specific problems themselves (e.g., 
the number of words in the problem), and 
on the variables that characterize the rela- 
tionship between individual problems (e.g., 
the structural similarity of two adjacent 
problems). 

One aspect of our research was unique to 
investigations of problem solving. It was 
conducted in the context of a computer-as- 
sisted instruction system developed by the 
Institute for Mathematical Studies in the 
Social Sciences (IMSSS) at Stanford Uni- 
versity over the last 8 years. The research 
reported continues the investigations begun 
in Suppes, Loftus, and Jerman (1969). 

A computer program was used to teach 
sixth-grade students how to solve arithme-
tic word problems on a computer-based tele-
type. Assuming that students had a basic
understanding of the four arithmetical op- 
erations (addition, subtraction, multiplica- 
tion, and division), we asked them to tell 
the computer which operations to use so 
that the actual computations were done by 
the computer. Following the initial instruc- 
tion set, a series of 100 word problems was 
presented to the students. A simple example 
of such a problem was: A bushel of corn
weighs 56 pounds. How much do 44 bushels 
weigh? 

A search of the literature reveals a few
studies on the effects of content of arithme-
tic word problems (Travers, 1967; Wash-
burne & Morphett, 1928), a few on the ef-
fect of language used in the problem
(Hydle & Clapp, 1927; Steffe, 1967), and a 
few on the effects of readability (Thomp- 
son, 1967). A handful of others that have 
been particularly relevant to our choice of 
variables is discussed later. Many more de- 
tailed studies dealing with specific struc- 
tural variables are needed for the develop- 
ment of a general theory. The present study 
was meant to be a modest contribution in 
this direction. 


THE THEORY


For the word problems analyzed in this 
paper, the main task was to identify the 
factors that contribute to the difficulty of 
an item. Exactly how each factor is defined 
is a matter that we discuss below. We shall 
attach weights to the various factors and 
then use estimates of the weights to pre- 
dict the relative difficulty of individual 
items.

To formulate linear structural models 
from which parametric predictions of rela- 
tive difficulty can be made we need some 
notation. Let the jth factor of problem $i$ in
the set of problems be denoted by $X_{ij}$. The
statistical parameters estimated from the
data are the weights attached to the factors.
We denote the weight assigned to the jth
factor by $a_j$. We emphasize that the factors
identified and used in the model presented 
in this paper are always objective factors 
independent of response data. The defini- 
tions of all the factors used in the analyses 
are straightforward; each factor has an in- 
tuitive and direct relevance to common- 
sense ideas of difficulty. 

Consider the analysis of the response 
data. For a given problem $i$, let $p_i$ be the
observed proportion of correct responses for
a group of students. The main task of a
model is to predict the observed proportion
$p_i$. The natural linear regression model in
terms of the factors $X_{ij}$ and the weights $a_j$
is:

$$p_i = \sum_j a_j X_{ij} + a_0.$$

To guarantee preservation of probability,
that is, to insure that predicted $p_i$'s will
always lie between 0 and 1, we make the
following transformation and define a new
variable $z_i$:⁴

$$z_i = \log \frac{1 - p_i}{p_i}. \tag{1}$$

We then use as the regression model

$$z_i = \sum_j a_j X_{ij} + a_0. \tag{2}$$
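
As a rough illustration of the transformation and of Equation 2, the following sketch computes the log-odds difficulty score for one item and inverts it. The function names are hypothetical, and the handling of proportions of exactly 0 or 1 follows the adjustment given in the accompanying footnote; no estimation procedure is implied.

```python
import math


def difficulty_logit(p, n):
    """Return z = log((1 - p) / p) for an observed proportion correct p.

    n is the number of subjects who responded to the item. When p is
    exactly 0 or 1 the log odds are undefined, so the footnoted
    adjustment is used: log(2n - 1) for p = 0 and log(1 / (2n - 1))
    for p = 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n < 1:
        raise ValueError("n must be a positive count")
    if p == 0.0:
        return math.log(2 * n - 1)
    if p == 1.0:
        return math.log(1.0 / (2 * n - 1))
    return math.log((1.0 - p) / p)


def predicted_proportion(z):
    """Invert the transformation: p = 1 / (1 + exp(z)), always in (0, 1)."""
    return 1.0 / (1.0 + math.exp(z))


def predicted_z(weights, intercept, factors):
    """Equation 2 for one problem: the sum of a_j * X_ij, plus a_0."""
    return intercept + sum(a * x for a, x in zip(weights, factors))
```

Because $1 - p_i$ is in the numerator, larger values of z correspond to harder items, and any predicted z maps back to a proportion strictly between 0 and 1.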

The rest of this section is devoted to dis- 
cussion of how each variable used in the 
regression analysis is defined. 

We consider two types of variables. Vari-
ables of the first type are 0,1-variables. A
0,1-variable is appropriate if, for example, 
we are dealing with a problem that requires 
a conversion of units, such as from days to 
weeks. If a problem requires such a conver- 
sion, the conversion variable for that prob- 
lem receives a value of 1. If no conversion is 
required, the conversion variable is given a 
value of 0. Variables of the second type 
assume a finite set of values, with the set
containing more than two values. Such a variable is ap-
propriate if we are concerned with the
length of a problem; the length variable is 
given a value which is equal to the number 
of words in the problem. 

Three other variables of the second type 
are the operations variable, the steps vari- 
able, and the depth variable. The value of 
the operations variable is the minimum 
number of different operations required to 
solve a problem. For any given problem,
this variable could take on a value of 1, 2, 
3, or 4. The value of the steps variable is 
the minimum number of steps required to 
reach the correct solution.⁵ These two vari-
ables may be distinguished more clearly if
we consider a problem that asks the student
to find the average of eight numbers. Such a
problem would give a value of 8 to the steps
variable and a value of 2 to the operations
variable. Seven steps of addition and one
step of division are required to solve this
problem.

⁴ When the observed $p_i$ is either 0 or 1, we used
the following transformation:

$$z_i = \begin{cases} \log (2n_i - 1) & \text{for } p_i = 0 \\ \log \dfrac{1}{2n_i - 1} & \text{for } p_i = 1 \end{cases}$$

where $n_i$ = the total number of subjects responding
to Item $i$. Note that putting $1 - p_i$ rather than
$p_i$ in the numerator of equation (1) makes the
variable $z_i$ increase monotonically in the ...
For example, if the length of a problem or the
number of steps needed to solve a problem in-
creased with the difficulty of the problem, it is de-
sirable that the model reflect this increase directly
rather than inversely.

⁵ To avoid any ambiguity, we always first mini-
mize the number of steps and then the number of
operations.

Before discussing the depth variable, we
must say a few words about the length vari-
able. Sentence length is frequently pro-
posed as the most obvious and plausible
factor contributing to sentence difficulty.
This factor is generally determined by the
total number of words in a sentence. Studies
in language acquisition (Brown, 1970;
Ervin, 1964; Miller & Ervin, 1963) give ev-
idence of a gradual progression of children’s
language development from one-word utter-
ances to utterances consisting of a greater
number of words. Many other developmen-
tal studies have shown similar increases
with chronological age (Davis, 1937;
Loban, 1963; McCarthy, 1930; Menyuk,
1963). Menyuk used mean sentence length
as a measure of increased verbal maturity.
Deutsch and Cherry-Peisach (1966) found
that sentence length was a significant vari-
able in distinguishing the speech of first-
grade children of different socioeconomic
groups. Braun-Lamesch (1962) found that
younger children cannot recall whole sen-
tences easily. Because this evidence indi-
cates that younger children in early lan-
guage development lack the ability to proc-
ess long sentences, it seems safe to say that
long sentences are more difficult for children
to comprehend than shorter sentences. For
the present, we shall generalize these results
and assume that longer word problems are
more difficult than shorter ones.

Since Frege, philosophers and also most
linguists agree that total comprehension of
a sentence requires recognizing and under-
standing the structural relationships in the
sentence. Factors that focus on element
counts (e.g., number of words, number of
pronouns, number of syllables per one
hundred words) have been successful in ac-
counting for only from 26 to 51% of the
variance in comprehension scores (Ruddell,
1964). This low percentage makes it ob- 
vious that the organization of language 
structure needs more attention. The meas-
ure of structural complexity that we use is
based on the depth hypothesis of Yngve 
(1960). Yngve described a procedure that 
assigns a number to each word of a sen- 
tence. The number reflects how embedded 
the word is in the sentence; the more 
embedded the word is, the higher the num- 
ber assigned to the word. Yngve’s procedure 
for determining the characterizing set of 
numbers for any sentence consists of draw- 
ing a phrase-structure tree diagram of the 
sentence in question and then counting the 
number of left branches leading to each 
word. The number of left branches that ter- 
minate the longest string of left branches 
represents the maximum depth of the sen- 
tence. Figure 1 illustrates the constituent- 
structure tree represented by the sentence 
The man saw the boy. The sentence can be 
characterized by the following set of num- 
bers: 2, 1, 1, 1, 0; these are the respective 
number of left branches leading to each 
word in the sentence. 

The first occurrence of the terminates the 
longest string of left branches. Since the 
terminates two left branches, the maximum 
depth for this sentence is two. Yngve 
(1964) claimed that the depth hypothesis 
explains many of the complexities of lan- 
guage in terms of their function in allowing 
a maximum depth of about seven, but no 
more. 
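
The counting procedure can be sketched in a few lines. The nested-list tree encoding and the function name are hypothetical conveniences, and the bracketing of the example sentence (S into NP and VP, NP into T and N, VP into V and NP) is assumed from the description of Figure 1. For a node with more than two branches, the sketch numbers the branches from right to left starting at 0, which reduces to counting left branches when every node is binary.

```python
def yngve_numbers(tree, pending=0):
    """Return (word, number) pairs for a phrase-structure tree.

    A tree is either a word (str) or a list of subtrees. Each child adds
    the count of sisters to its right to the running total, so in a
    binary tree every left branch adds 1 and every right branch adds 0.
    """
    if isinstance(tree, str):
        return [(tree, pending)]
    pairs = []
    last = len(tree) - 1
    for position, child in enumerate(tree):
        pairs.extend(yngve_numbers(child, pending + last - position))
    return pairs


# Assumed bracketing: [[The man] [saw [the boy]]]
sentence = [["The", "man"], ["saw", ["the", "boy"]]]
numbers = [n for _, n in yngve_numbers(sentence)]
max_depth = max(numbers)  # the maximum depth of the sentence
```

With this bracketing the numbers come out as 2, 1, 1, 1, 0, and the maximum depth is two, in agreement with the values given in the text.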

Martin and Roberts (1966) modified 
Yngve's depth measure by using the aver- 
age number of left branches per word in a 
sentence as their measure of structural com-
plexity. The depth of the sentence, The man
saw the boy, is equal to the mean of its
Yngve numbers, or (2 + 1 + 1 + 1 + 0)/5
= 1.00. Martin and Roberts presented sub-
jects with sentences that differed in depth.
Out of six low-depth sentences, subjects 
correctly recalled an average of 3.9 sen- 
tences; recall for high-depth sentences was 
3.1 sentences. Martin, Roberts, and Collins 
(1968) demonstrated additional support for 
the depth hypothesis in a task of recall of 
single sentences. Other investigators (Per-
fetti, 1969; Rohrman, 1968) found no sup-
port for the depth hypothesis in recall
tasks.

The conflicting reports cast some doubt
on the general value of the Yngve hypothe-
sis in recall tasks. However, the hypothesis
may have some value for our understanding 
of word-problem difficulty. The notion of 
quantifying the structural complexity of a 
word problem and relating that complexity 
to problem difficulty is appealing. For a 
given problem, then, let its structural com-
plexity, or depth, be formally defined as
follows:


1. The mean of the Yngve numbers is 
computed for each sentence in the 
problem. 

2. The highest value of this set of what 
might be called Yngve means is taken 
as a measure of the structural com- 
plexity of the problem as a whole. In 
other words, we assume that a problem 
is as complex as its most complex sen- 
tence. 


Fig. 1. The constituent-structure tree of The man saw the boy.

The procedure is illustrated by the fol-
lowing simple example. Suppose the prob-
lem is:

Jim has 40 bottles. Ken has 30 bottles.
They have how many bottles together?

Sentence 1 can be characterized by the fol-
lowing numbers: 1, 1, 1, 0, with a mean of
.75. Sentence 3 can be characterized by the
numbers 1, 1, 3, 2, 1, 0, with a mean of 1.33.
The structural complexity or depth of the
problem is 1.33.
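
The two-step definition of problem depth can be written out as a short sketch. The function names are hypothetical. The Yngve numbers for Sentences 1 and 3 are those quoted above; the text gives none for Sentence 2, which is assumed here to match Sentence 1 because the two sentences have the same form.

```python
def mean_depth(numbers):
    """Mean Yngve number per word for one sentence."""
    if not numbers:
        raise ValueError("a sentence needs at least one word")
    return sum(numbers) / len(numbers)


def problem_depth(sentences):
    """Depth of a word problem: the largest of its sentence means."""
    return max(mean_depth(numbers) for numbers in sentences)


jim_ken = [
    [1, 1, 1, 0],        # Jim has 40 bottles.
    [1, 1, 1, 0],        # Ken has 30 bottles. (assumed, not given in the text)
    [1, 1, 3, 2, 1, 0],  # They have how many bottles together?
]
depth = round(problem_depth(jim_ken), 2)
```

Taking the maximum rather than an average over sentences reflects the stated assumption that a problem is as complex as its most complex sentence.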

At this point, it is important to mention
that coding the depth of a sentence objec-
tively is not an easy matter. Any use
of the Yngve metric that does not consider
this difficulty is naive. The coding problem
was mentioned by Rohrman (1968) in his
attack on Martin and Roberts (1966).
Martin and Roberts characterized the sen-
tence, “Children are not allowed out after
dark,” by the numbers 1, 4, 3, 2, 1, 1, 0;
“are” was assigned a 4. Rohrman argued
that it was very difficult to see what kind of
tree could possibly give that many
branches leading to the auxiliary verb
“are.” It is certainly possible for a
sentence to have more than one
tree, in which case there would be a differ-
ent mean depth for each tree. This is often
the case with ambiguous sentences; typi-
cally, they have more than one tree and a
different mean depth for each. However, in 
the context of a complete word problem, 
none of the sentences used in the study is 
ambiguous. The problem of coding still ex- 
ists, however, because Yngve failed to 
provide an explicit set of rules for assigning 
numbers to words in a sentence. This is not 
meant as a criticism of Yngve, because pro- 
viding such a set of rules is essentially 
equivalent to providing a phrase-structure 
grammar for the given fragment of English, 
a clearly difficult task. Perhaps a more seri- 
ous difficulty is the assumption that trees 
can satisfactorily characterize the structure 
of English sentences, or, put another way, 
that neither transformations nor contexts 
need be considered. Fortunately, this as- 
sumption that the fragment of English used 
in the word problems is context-free is not 
too badly violated. 

To assess the degree of reliability be- 
tween two people coding these problems 
independently, J. Dexter Fletcher, a gradu-
ate student in psycholinguistics, coded a
sample of 20 problems. The Pearson coeffi- 
cient was .84 (r? = .71) between the Yngve 
values we obtained and those he obtained. 

The first 0,1-variable is the sequential
variable, the only variable in this study
that emphasizes the relationship between
individual problems rather than the struc-
ture of the individual problems. If a prob-
lem cannot be solved by the same opera-
tion(s), and in the same order, as the prob-
lem that preceded it, the sequential variable
for that problem is assigned the value of 1.
If a problem is of the same type as the
preceding one, the value for this variable is
0. Successful use of a sequential variable
has been made in the analysis of fractions
(Suppes, Jerman & Brian, 1968, Chapter 7)
and in the analysis of arithmetic word
problems (Suppes et al., 1969).

The emphasis on such a sequential vari-
able is in the spirit of recent work on verbal
learning. In free recall, for example, the im-
portance of the relationship between items
in a list is well documented. Underwood and
Schulz (1960) and Postman (1964) stated
quite explicitly that recall may be facili-
tated by associations among items in a list.
In other words, recall of a particular item
depends not only on the item qua item, but
also on the relationship between the item
and other items in the list. Other psycholo-
gists have postulated the relationship be-
tween list items and the general experimen-
tal context to account for the response-
learning stage in paired-associate learning
(Keppel, 1964; McGovern, 1964; Under-
wood, 1964). Using a reaction-time technique,
Carey, Mehler, and Bever (1970) presented 
subjects with a picture, then with a sen- 
tence, and asked them to judge the sentence 
true or false with respect to the picture. 
Results showed that the response latency 
for an ambiguous sentence clearly depended 
upon the particular syntactic structure of 
prior sentences that the subjects had heard. 
The abundance of evidence in the literature 
of the effects of interitem relationships indi- 
cates that this matter is of great psycholog- 
ical importance. 

The verbal-clue variable is the second 
0,1-variable. Brownell and Stretch (1931) 
suggested that a problem can be analyzed 
into several elements or factors, one of 
which is a verbal clue to the operations. 
This factor was not varied systematically, 
and so no systematic conclusions could be 
drawn about it. 

Kendler and Kendler (1962), who discuss 
problem solving in stimulus-response terms, 
claimed that verbal behavior is necessary
for problem solving. Furthermore, they said 
that problem-solving ability depends upon 
the development of verbal behavior that 
mediates between the problem stimulus and 
the problem-solving behavior. At one point, 
they suggested that investigation of the cue 
function of words might prove fruitful. 
Other work of Kendler and associates (e.g., 
Kendler & D'Amato, 1955; Kendler & Kar- 
asik, 1958; Kendler & Mayzner, 1956; 
Kendler & Vineberg, 1954) has demon- 
strated the critical role of verbal-discrimina- 
tive responses in problem solving. These 
findings suggest that the provision of a ver-
bal clue to the operation(s) required to solve
a word problem may facilitate solution. 

In the following problem, A wooden box
contains 23 red beads and 83 blue beads.
How many beads does it contain in all?, the
word and should help the person to discrim-
inate among the four operations he could
use and to choose the one (addition) that
he should use. In a sense, the word and is a 
cue or a label for the operation of addition. 
The importance of the verbal responses of 
labeling in a multitude of situations is well 
known (Miller, 1948). 

We define the verbal-clue variable as fol- 
lows: 

1. The verbal clue for problems requiring
a single addition is the word and; if the
problem does not contain this word, 
the verbal-clue variable for that prob- 
lem is assigned a value of 1, and 0 
otherwise. 

2. The corresponding verbal clues for the 
other operations are: (a) left or a 
comparative for subtraction, (b) each 
for multiplication, and (c) average or 
each appearing in the question sen- 
tence of the problem for division. 
Problems requiring multiple operations 
must contain all of the verbal clues 
pertaining to the required operations 
in order that the verbal-clue variable 
be assigned a value of 0. 
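
The coding rule above can be sketched as follows. Several details are assumptions made only for illustration: the function names, the short list of comparatives (the text does not enumerate them), the treatment of the question sentence as the one ending in a question mark, and the reading that either average or each must appear in that sentence for division.

```python
import re

# Hypothetical stand-in list; the text does not enumerate the comparatives.
COMPARATIVES = frozenset({"more", "less", "fewer", "older", "younger",
                          "taller", "shorter", "longer", "heavier"})


def _words(text):
    """Lower-cased alphabetic words of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def _question_sentence(problem):
    """Return the sentence that ends in '?', or '' if there is none."""
    for sentence in re.split(r"(?<=[.?!])\s+", problem.strip()):
        if sentence.endswith("?"):
            return sentence
    return ""


def has_clue(operation, problem):
    """True if the problem contains the verbal clue for one operation."""
    words = _words(problem)
    if operation == "add":
        return "and" in words
    if operation == "subtract":
        return "left" in words or bool(words & COMPARATIVES)
    if operation == "multiply":
        return "each" in words
    if operation == "divide":
        return bool(_words(_question_sentence(problem)) & {"average", "each"})
    raise ValueError(f"unknown operation: {operation}")


def verbal_clue_variable(operations, problem):
    """Return 0 if every required operation has its clue, else 1."""
    return 0 if all(has_clue(op, problem) for op in operations) else 1


beads = ("A wooden box contains 23 red beads and 83 blue beads. "
         "How many beads does it contain in all?")
beads_value = verbal_clue_variable(["add"], beads)  # clue word "and" present
```

A problem requiring several operations receives 0 only when every one of its operations has a clue, which is why the check uses `all`.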

The order variable is the third 0,1-vari-
able. Burns and Yonally (1964) asked the
question, “Does the order of presentation of 
numerical data in multi-step problems af- 
fect their difficulty?” Their results indicated 
that students were less successful in getting 
the correct answer to word problems when 
the numerical data were presented differ- 
ently from the order needed to solve the 
problem. These results suggested a new fac- 
tor, the order variable, assigned a value of 0 
if the problem can be solved by using the 
numerical data in the order given in the 
verbal statement of the problem. Note that 
the numerical data need not necessarily be 
so used, but if they can be used in the order
presented, the value for the order variable 
is 0. If the order of the numerical data must 
be reversed, then the value of the order var- 
iable is 1. 

The conversion variable is the last 0, 1- 
variable. If a problem requires a conversion 
of units (e.g., from months to weeks), the
conversion variable for that problem is as- 
signed a value of 1, and 0 otherwise. The 
importance of this variable was suggested 
by the results of Suppes et al. (1969). 

In summary, the variables investigated
were:

X1 = the operations variable: the minimum
     number of different operations
     required to reach the correct solution;

X2 = the steps variable: the minimum
     number of steps required to reach the
     correct solution;

X3 = the length variable: the number of
     words in the problem;

X4 = the depth variable: the Yngve mean
     for the most complex sentence in the
     problem;

X5 = the sequential variable: assigned a
     value of 1 if the problem is not of the
     same type (i.e., cannot be solved by
     the same operations) as the problem
     that preceded it, and 0 otherwise;

X6 = the verbal-clue variable: assigned a
     value of 1 if the problem does not
     contain a verbal clue to the operations
     required to solve the problem,
     and 0 otherwise;

X7 = the order variable: assigned a value
     of 1 if the numerical data are presented
     in some order other than an
     order in which they can be used to
     solve the problem, and 0 otherwise;

X8 = the conversion variable: assigned a
     value of 1 if a conversion of units is
     required to solve the problem, and 0
     otherwise.

It should be noted that the higher the value
assigned to a variable, the more difficult the
problem is assumed to be.


DESIGN AND EXPERIMENTAL PROCEDURE

DESIGN AND EXPERIMENTAL PROCEDURE 


Subjects 


The 16 subjects who completed the problem-solving
program were sixth-grade students from
two elementary schools. Both schools are in the
Ravenswood City School District in California.
The district, in which 35,000 people live in a
[illegible]-square-mile area, comprises 5% of the total
[illegible] school population. Thirty-three percent of the
county welfare families live within this school district.
Both schools are essentially “depressed area”
schools. In one school, 82% of the children were
black, and the average sixth-grade IQ was [illegible]; in
the other school, 59% of the children were black,
and the average sixth-grade IQ was 99.


Equipment


The student terminals used in this project were
commercially available teletype machines connected
by private telephone lines to a computer at
IMSSS. Five teletypes at one school and four
at the other were operated in classrooms designated
for that purpose.

The control functions for the entire system were 
handled by the PDP-1, a medium-sized computer 
with a 32,000-word core and a 4,000-word core 
interchangeable with any of 32 bands of a mag- 
netic drum, together with two large IBM-1301 disc 
files. All input-output devices were processed 
through a time-sharing system. Two high-speed 
data channels permitted simultaneous computa- 
tion and servicing of peripheral devices. 


Instructional Program 


Initial instruction on the teletypes consisted of 
explaining to each student the general procedure 
of taking turns on the machine and the general 
program logic. Each student was assisted in finding
the letters to type his name during the first two
lessons. No student had any trouble learning how
to type his name or how to answer the questions on 
the teletype. 

The program began each day by asking the stu- 
dent to type his assigned number and his name. If 
the student made an error or gave a fictitious 
name, he was asked to try again. If he correctly 
typed his number and name, the computer ad- 
dressed the student’s file and began with the item 
following the last one completed. The items were 
divided into two parts, with the set of instructions 
being presented before the set of problems. 


Set of Instructions 


The student was taught how to command the
computer to perform operations on given numbers
by using a set of instructions. The complete set of
instructions is given in Loftus (1970). We list
briefly and give some examples of the abbreviated
operation names that the student learned in the
instruction set. Student entries are underlined. An X
was the answer key. Suppose the student saw on
the printout sheet before him:


G 1) 21


He would indicate that 21 was his answer by typing
1X, which says to the computer, “my answer is
on Line 1.” The line number followed by X indicates
what line the final answer is on.

Second, A was the abbreviation for ADD. An
example of how a student might use the A rule is:


G 1) 36
G 2) 41
1.2A 3) 77


By typing 1.2A, the student instructed the computer
to add the number on Line 1 to the number
on Line 2. The computer then printed the result
of the addition operation. The letter S was the
abbreviation for SUBTRACT, M for MULTIPLY,
and Q for DIVIDE.

6 The letter G stands for given number. Whenever
a student was given a word problem to solve,
the numbers in the problem were typed out as
given numbers just after the word problem itself
was typed out. The reason for designing the program
this way was to reduce the time required
for the student to input large numbers.

The letter E was the abbreviation for ENTER. 
This instruction was used to enter a number that 
was not entered by the computer program. For 
example, in a problem that asked the student to 
find the number of days in 8 weeks, the student 
was required to enter the number 7, the number 
of days in 1 week. 

The following sequence of interactions between 
the student and the computer illustrates how a 
word problem was solved in this context. Again, 
student entries are underlined. The computer 
first typed out the problem, and then typed out 
the numbers in that problem. The student saw on 
the printout sheet before him:


At the tree nursery, Tom counted 28 rows 
of pine trees. The forester said that there 
were 575 trees in each row. How many trees 
were there at the nursery... 


G 1) 28
G 2) 575


At this point, the student told the computer to 
perform a given operation, and designated the line 
numbers to which the operation should apply. For 
this problem, the student typically typed out 1.2M,
meaning “multiply the number on Line 1 by the 
number on Line 2.” The computer responded by 
typing the result of applying the operation, or by 
typing an error message if the operation could not 
be validly applied. 

The student had to complete the problem by 
typing the line number on which the answer ap- 
peared, followed by an X. The complete protocol 
for a correct response in the above example might 
be: 

At the tree nursery, Tom counted 28 rows of pine
trees. The forester said that there were 575 trees
in each row. How many trees were there at the
nursery...


G 1) 28
G 2) 575
1.2M 3) 16,100
3X

Correct.


If the answer was incorrect, “answer is wrong” appeared
in place of “correct.” If the student had not
indicated his final answer by using X, and if he
asked the computer to perform an operation that
could not be validly applied, he received an error
message. In the above example, if instead of typing
1.2M the student had typed 1.2MT, the computer
would have responded by typing, “There is no
rule name ‘MT’.” If the student had erroneously
typed 1.2, the computer would have responded by
typing, “No rule name given.”
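
The command language described above is small enough to sketch as a toy interpreter. The Python below is an illustration only and not a reconstruction of the original PDP-1 program: the class and method names are invented, the `E7` form for ENTER is an assumption (the text does not show how ENTER was typed), the operand order for S and Q (first line minus, or divided by, the second) is assumed, and any message marked "hypothetical" in the comments does not come from the paper. The two rule-name error messages are the ones quoted in the text.

```python
import re


class ToyTeletype:
    """Toy model of the command language; not the original program."""

    RULES = {
        "A": lambda a, b: a + b,  # ADD
        "S": lambda a, b: a - b,  # SUBTRACT (assumed: first line minus second)
        "M": lambda a, b: a * b,  # MULTIPLY
        "Q": lambda a, b: a // b if a % b == 0 else a / b,  # DIVIDE (assumed order)
    }

    def __init__(self, given_numbers, answer):
        self.lines = list(given_numbers)  # line k is stored at index k - 1
        self.answer = answer

    def _has_line(self, k):
        return 1 <= k <= len(self.lines)

    def _store(self, value):
        self.lines.append(value)
        return f"{len(self.lines)}) {value}"

    def send(self, entry):
        """Process one student entry and return the computer's reply."""
        entry = entry.strip().upper()

        match = re.fullmatch(r"(\d+)X", entry)  # e.g. "3X": answer is on Line 3
        if match:
            k = int(match.group(1))
            if not self._has_line(k):
                return f"There is no line {k}."  # hypothetical message
            return "correct" if self.lines[k - 1] == self.answer else "answer is wrong"

        match = re.fullmatch(r"E(\d+)", entry)  # e.g. "E7": assumed ENTER syntax
        if match:
            return self._store(int(match.group(1)))

        match = re.fullmatch(r"(\d+)\.(\d+)([A-Z]*)", entry)  # e.g. "1.2M"
        if match:
            i, j, rule = int(match.group(1)), int(match.group(2)), match.group(3)
            if rule == "":
                return "No rule name given."
            if rule not in self.RULES:
                return f"There is no rule name '{rule}'."
            if not (self._has_line(i) and self._has_line(j)):
                return "There is no such line."  # hypothetical message
            a, b = self.lines[i - 1], self.lines[j - 1]
            if rule == "Q" and b == 0:
                return "Cannot divide by zero."  # hypothetical message
            return self._store(self.RULES[rule](a, b))

        return "Entry not understood."  # hypothetical message


session = ToyTeletype([28, 575], answer=16100)
for entry in ("1.2MT", "1.2", "1.2M", "3X"):
    print(entry, "->", session.send(entry))
```

Keeping every intermediate result as a new numbered line mirrors the printout format in the protocol: any later command can refer back to any earlier line, which is what allows several different step sequences to reach the same answer.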

A word problem can often be solved in many
ways. The student’s own experience and ingenuity
determine which rule he uses and what strategy
he takes. The computer allows any valid step,
regardless of whether it helps reach the solution.
Any combination of steps that reaches a solution,
valid within the rules, is entirely acceptable. For
instance, the following problem can be solved in
several ways. “For an experiment, Susan mixed 7
ounces of glycerin and 14 ounces of alcohol with
some water. The resulting mixture contained 45
ounces. How many ounces of water were used?” It
can be solved:


45 − (7 + 14) or (45 − 7) − 14.


A more idiosyncratic solution, such as 45 −
(7 × 3), is equally acceptable.

In the instruction set, the student was given 
easier problems before being presented with more 
difficult ones. In several of the problems, the student
was invited to ask for a hint after a certain
time lapse by the message, “Type H and a space if
you want a hint.” If the student asked for a hint
on the problem “What is (486 + 390) + 707?” he
was told “First find 486 + 390. Then add that sum
to 707.” No hints were available for multiple-choice
problems; the student had to guess until he
got the problem correct. This was also true of the 
word-problem set as opposed to the instruction 
set. 


Word-Problem Set 


The 100 word problems used in this study were 
designed to be of appropriate difficulty for sixth- 
grade students. The word problems are listed in 
Loftus (1970). These 100 problems were divided 
into 50 pairs; a pair consisted of two problems both 
of which could be solved by the same operation or 
sequence of operations. The 50 pairs were then
randomly permuted with the following restriction:
no 2 pairs whose problems required identical operations
for solution could be presented adjacent to
each other. Five randomizations were obtained,
and each subject was assigned to one of the five
random sequences. The problems were arranged
in this way so that for a given pair of problems
the first problem never followed a problem of the
same type; thus, the sequential variable for that
problem always received a value of 1. The second
problem in the pair always followed a problem of
the same type; the sequential variable for that
problem always received a value of 0. More generally,
the problem set was designed to provide
many different combinations of variable values.
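
One way to produce such a sequence is rejection sampling: shuffle the pairs and reshuffle until no two neighbouring pairs share an operation type. The Python sketch below is an assumption about procedure (the paper does not say how the permutations were generated), and the function names and toy type labels are invented for the example. It also shows how the sequential variable follows mechanically from the pairing.

```python
import random


def permute_pairs(pairs, rng, max_tries=100_000):
    """Shuffle (type_label, first, second) pairs until no two adjacent
    pairs share a type label. Rejection sampling; an assumed procedure."""
    order = list(pairs)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(a[0] != b[0] for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no admissible permutation found")


def sequential_values(ordered_pairs):
    """Flatten the pairs into (problem, sequential_variable) tuples."""
    coded = []
    previous_type = None  # nothing precedes the very first problem
    for type_label, first, second in ordered_pairs:
        coded.append((first, 1 if type_label != previous_type else 0))
        coded.append((second, 0))  # same type as the problem just before it
        previous_type = type_label
    return coded


# Toy data: six pairs and four made-up type labels.
pairs = [("add", "a1", "a2"), ("add", "a3", "a4"), ("subtract", "s1", "s2"),
         ("multiply", "m1", "m2"), ("multiply", "m3", "m4"), ("divide", "d1", "d2")]
ordered = permute_pairs(pairs, random.Random(0))
print(sequential_values(ordered))
```

Rejection sampling is simple but can be slow when one type dominates the set; a constructive method would be needed in that case.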

To solve the set of problems, the student used 
the rules he learned in the instruction set. As be- 
fore, the computer first typed out the problem, fol- 
lowed by the numbers in that problem. Then, us- 
ing any of the rules mentioned above, the student 
told the computer what to do with these numbers. 
After the computer typed out all the numbers in 
the problem as “given numbers,” the type wheel 
of the teletype was positioned at the left-hand
side of the paper. The student made his response,
and then the computer positioned the type wheel
at the center of the page, typed the line number,
and finally the result of the operation the student
had commanded the computer to perform. If the
final answer was correct, the computer typed the
message “correct.” If the final answer was incorrect,
the computer typed “answer is wrong.” In
both cases it went to the next problem.

When working on the teletype, the student was 
not allowed to use pencil or paper. Every problem 
was worked on the machine, so that all responses 
could be recorded. 

Following the “goodbye” message the student 
was told “please tear off on dotted line.” A dotted 
line was printed, and the student tore off his 
printout and gave it to the experimenter. 

Typically, it took about 8 weeks to complete 
both the instruction set and the word-problem set.
Each portion took 4 weeks. However, the students 
at one school had such initial difficulty with the 
program that they were allowed to repeat portions 
of the instruction set before beginning the problem 
set, since we wanted them to learn the rules as 
thoroughly as possible before beginning to solve 
the test problems. This group took a mean of 12 
weeks to finish the program: 8 weeks for the in- 
struction set and 4 weeks for the word-problem set. 
Students in the other group took a total of 8 weeks 
to complete the work. 


RESULTS


The first step in analysis was to obtain 
regression coefficients for each of the eight 
variables described earlier. A stepwise, mul- 
tiple linear regression analysis program 
(BMD 02R), adapted for Stanford University's
IBM 360 computer, was used to obtain
regression coefficients, multiple correlation
R, and R².

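The mean percentage of correct solutions
for 16 subjects was 47.09. The regression
equation was:

$$
\begin{aligned}
z_i = {} & -8.24 + .48X_{i1}^{***} + .04X_{i2} + .02X_{i3}^{**} + .88X_{i4}^{***} + .61X_{i5}^{***} \\
& + .20X_{i6} + .13X_{i7} + .49X_{i8}^{*}
\end{aligned}
$$

* p < .05;
** p < .01;
*** p < .001;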

with a multiple R of .83, a standard error of
estimate of .52, and an R² of .70. The reason
that X3 (length) was significant, despite
its small regression coefficient, is that
the standard error of the regression
coefficient was .006. The T value is computed
by dividing the regression coefficient
by its standard error. Table 1 presents the
regression coefficients, standard errors of
regression coefficients, computed T values,
and partial correlation coefficients for each
of the eight independent variables. Table 2
presents the independent variables in order,
as introduced in the stepwise regression,
with corresponding multiple correlations.
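
The relation between coefficient, standard error, and T value can be checked directly from the rounded entries of Table 1. The short Python sketch below recomputes each ratio; because the table reports coefficients to three decimals, the recomputed values agree with the printed T values only approximately (length, for instance, gives .017/.006 ≈ 2.83 against the printed 2.850). The variable names are taken from the table; the dictionary layout and function name are conveniences for the example.

```python
# (regression coefficient, standard error, printed T value) from Table 1
TABLE_1 = {
    "operations":  (0.483, 0.103, 4.715),
    "steps":       (0.041, 0.054, 0.761),
    "length":      (0.017, 0.006, 2.850),
    "depth":       (0.879, 0.229, 3.839),
    "sequential":  (0.611, 0.106, 5.753),
    "verbal clue": (0.196, 0.119, 1.651),
    "order":       (0.133, 0.125, 1.007),
    "conversion":  (0.494, 0.220, 2.252),
}


def t_value(coefficient, standard_error):
    """T value as defined in the text: coefficient divided by its standard error."""
    return coefficient / standard_error


for name, (b, se, printed) in TABLE_1.items():
    print(f"{name:12s} recomputed {t_value(b, se):6.3f}   printed {printed:6.3f}")
```

The length variable illustrates the point made in the text: a coefficient of .017 looks negligible, but its standard error is smaller still, so the ratio is large.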

The partial correlation coefficients indicate
that X5, the sequential variable, is the
most important of the eight variables. The
operations variable, X1, the depth variable,
X4, and the length variable, X3, are also
significant predictors of the probability of a
correct response for each item. The conversion
variable, X8, is moderately significant.
A rough indication of the goodness of fit of
the regression line is given by the multiple
correlation coefficient (R) and its square
(R²), which is an estimate of the amount of
variance accounted for by the regression
model, which in this case is 70%.

Figure 2 is a graph of the predicted and
observed proportions of correct responses
for each of the 100 items. The probabilities
are plotted as a function of the rank of
observed proportion of correct responses.
Consequently, the curve of the observed
probabilities is monotonically decreasing
and smoother than the predicted curve. An
inspection of the two curves shows a reasonable
fit for the regression model, but the
model does not fit the very difficult or very
easy items well. For an analysis of goodness


TABLE 1
REGRESSION COEFFICIENTS, STANDARD ERRORS OF
REGRESSION COEFFICIENTS, COMPUTED T
VALUES, AND PARTIAL CORRELATION
COEFFICIENTS

Variable          Regression    Standard   Computed   Partial correlation
                  coefficient   error      T value    coefficient
X1 operations     .483          .103       4.715      .443
X2 steps          .041          .054        .761      .080
X3 length         .017          .006       2.850      .286
X4 depth          .879          .229       3.839      .373
X5 sequential     .611          .106       5.753      .516
X6 verbal clue    .196          .119       1.651      .171
X7 order          .133          .125       1.007      .111
X8 conversion     .494          .220       2.252      .230

Note. N = 16.


TABLE 2
ORDER OF INTRODUCTION OF THE VARIABLES IN THE
REGRESSION WITH CORRESPONDING
CORRELATIONS

Variable          R
X1 operations     .67
X5 sequential     .77
X3 length         .79
X4 depth          .81
X8 conversion     .83
X6 verbal clue    .83
X7 order          .83
X2 steps          .83

Note. N = 16.


of fit, the predicted probability, p_i, of a correct
response for Problem i was first calculated
from the regression model, and then χ²
was calculated, where:

$$
\chi^2 = \sum_i \frac{(f_i - p_i N)^2}{p_i (1 - p_i) N},
$$

and f_i = observed frequency of correct response,
N = number of students. For the
above model, χ² = 206.74.
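
As a worked illustration, each problem's contribution to the statistic can be computed from its observed frequency and predicted probability. The Python sketch below assumes the statistic has the form of a sum over problems of (f_i − p_i N)² / [p_i (1 − p_i) N], which is how the formula above is read here; the function names are invented for the example.

```python
def chi_square_term(f_obs, p_pred, n):
    """Contribution of one problem: squared deviation of the observed
    frequency from its expectation p*N, divided by the binomial variance."""
    expected = p_pred * n
    return (f_obs - expected) ** 2 / (p_pred * (1 - p_pred) * n)


def chi_square(observed, predicted, n):
    """Sum of the per-problem terms over all problems."""
    return sum(chi_square_term(f, p, n) for f, p in zip(observed, predicted))


# A problem solved by 1 of 16 students when the model predicts p = .50:
print(chi_square_term(1, 0.50, 16))  # (1 - 8)**2 / 4 = 12.25
```

Because the denominator shrinks as p approaches 0 or 1, a problem whose predicted probability is extreme and wrong contributes far more than one whose prediction is moderate and equally wrong in absolute terms.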

This rather high value for chi-square is
an indication that the correspondence between
the observed and expected frequencies
is not very close. A more detailed look
at the components of chi-square shows that
a few problems made extremely large contributions
to the total chi-square. The following
problem, for example, contributed
6.3% to the total chi-square obtained. “A
school playground is rectangular, 278 feet
long and 21 feet wide. What is the total
length of the fence around the playground?”
The observed proportion of correct
responses for this item was .06, while
the predicted proportion was .50; clearly,
this is a poor fit. As a second example, the
following problem contributed 5.3% to the
total χ² obtained. “Mary is twice as old as
Betty was 2 years ago. Mary is 40 years old.
How old is Betty?” None of the 16 subjects
solved this problem correctly, although
.39 was the predicted proportion of correct
responses. The large deviations between the
observed and predicted results for certain
problems, such as the two just mentioned,
emphasize the need for a more elaborate
theory.

Fig. 2. Problem rank order according to proportion correct.
(Axis labels: proportion correct; ranked according to observed difficulty.)

Most of the predictions can be made by a
small number of variables, and the inclusion
of additional variables adds little. In
the present case, most of the variance can
be accounted for by variables X1, X3, X4,
and X5. If we reduce the number of variables
in the regression equation to include
only these, the reduction in multiple R and
R² is slight. The regression equation becomes:

$$
z_i = -2.89 + .64X_{i1} + .02X_{i3} + .64X_{i4} + .63X_{i5},
$$

with a multiple R of .81, a standard error of
estimate of .54, and R² of .66. The standard
errors of the regression coefficients were X1,
.081; X3, .006; X4, .225; and X5, .109. All
four variables are significant.


DISCUSSION


The results show the following variables
are important in determining word-problem
difficulty: sequential, operations, depth,
length, and conversion. These results
imply that a word problem will be difficult
to solve if it differs from the problem type 
that preceded it, if its solution requires a 
large number of different operations, if its 
surface structure is complex, if it has a 
large number of words or if it requires a 
conversion of units. The multiple correla- 
tions and thus the predictive results of this 
analysis are rather impressive. There is 
considerable difficulty in intuitively rank 
ordering the expected proportions of correct 
responses obtained in word problems. We 
believe that our results give a sense of the 
real possibility of analyzing and predicting 
in terms of meaningful variables, the re- 
sponse performance of children who are 
solving arithmetical word problems. At first 
glance, the problem set appears complex; 
yet, with a few variables, we have brought 
a considerable amount of order to it. In 
view of the intrinsic complexity of this type
of problem solving, the fit obtained is excellent.

It is interesting and potentially instructive
to compare the results of performance
of this “disadvantaged” group with the results
of a similar study using subjects with
a mean IQ greater than 120 (Suppes et al.,
1969). The important variables reported by
Suppes et al. were the sequential, operations,
and conversion variables. Depth and
order were not investigated in that study.
The most important variables in the present
study were operations, sequential, depth,
and length; conversion was of secondary,
although significant, importance. The most
suggestive finding is the importance of the
sequential and operations variables. These
two variables are highly significant determinants
of difficulty for the bright as well
as disadvantaged students. Whether students
are bright or dull, they are more
likely to solve a problem correctly if it is
similar to the problem that preceded it or if
its solution requires a small number of different
operations. The implication is that
many aspects of the internal processing
done by students when they solve problems
do not differ for children of differing mental
ability.

Recall that in the Results section two
problems were cited that contributed most
heavily to the total chi-square obtained. In
the first problem, subjects typically multiplied
the two numbers together, or added
the two numbers together only once. The 
difficulty seems to arise from confusion 
about what a perimeter is, as distinguished 
from an area, and how to find a perimeter. 
The second problem about age is a typical 
puzzle that students have difficulty solving. 
The source of difficulty may be in under- 
standing how to begin the analysis. In any 
case the variables considered here do not 
adequately characterize the difficulty of ei- 
ther problem. The generality of this finding 
is supported by the fact that the same two 
problems contributed most heavily to the 
chi-square obtained by Suppes et al. (1969) 
in their study of bright children. 

This study represents only a tentative 
preliminary effort at the construction of a 
more mature theory of problem solving. On 
the one hand, more refined analysis with 
data from larger numbers of students is 
needed. On the other, a deeper conceptuali- 
zation of the internal processing engaged in 
by the subjects is needed. The definition of 
more variables may not be sufficient, but 
rather the variables considered here, and 
probably additional ones as well, must be 
embedded in a processing model that is explicitly
temporal in character. Probabilistic
automata as examples of such models, but
applied to arithmetic problems in standard
format (a much simpler context than the
present one), are described in Suppes
(1969).


PHYSICAL AND SOCIAL DISTANCING IN TEACHER-PUPIL
RELATIONSHIPS


The behavior of pupils who had been assigned seats in the front, middle,
and back rows of their 14 grade-school classrooms was assessed
through observation and teacher interview. New seats were then as- 
signed to the subjects, and a second set of measures, which also in- 
cluded pupil questionnaires, was taken. The data indicated that (a) 
teachers strive to assign seats in ways that minimize classroom disruption;
(b) children assigned by teachers to the front row are more
attentive to classroom activities than classmates in the middle and
back rows; and (c) occupancy of seats in the front, in contrast to
those in the middle and back, affects in a positive manner the way in
which pupils are perceived by their teacher and peers, and the way in
which pupils evaluate themselves.


The present investigators had observed
that children seated in the front row of elementary
school classrooms often appeared
more attentive and less disruptive than their
peers in the back. This observation, informal
conversations with teachers, and findings
reported by Goldenberg (1969) and
Johnson, Feigenbaum, and Weiby (1964)
raised interest in studying the basis on
which teachers assign seats and the effect
that particular assignments have on pupils.

Dimensions that seemed relevant to the
seat assignment process were physical and
social distance. Several studies (Campbell,
Kruskal, & Wallace, 1966; Leipold, 1963)
have suggested that people select seats in
ways which enable them to use physical
distance as an aid in maintaining social distance.
For instance, high school and college
students indicating a need for social distance
from their instructors were found to
choose seats in the back of their classrooms
(Levinger & Gunner, 1967; Walberg, 1969).

Extrapolating from these studies, one
might assume that those assigning seats do
so partly on the basis of the social distance
they wish to maintain from individual assignees.
With respect to the classroom, for
instance, the teacher's physical distance
from pupils she assigns to front, middle,
and back row seats may be related to the
social distance she wishes to maintain from
them.

Regarding the effect a seat assignment 
may have on a pupil, two factors seem rele- 
vant: the pupil’s understanding of why he 
was assigned to a particular seat, and 
whether the seat’s location facilitates atten- 
tion. Seat assignment or the allocation of 
“territory” in a social situation carries cul- 
turally coded messages (e.g., at formal dip- 
lomatic dinner parties, protocol requires 
that those of higher status be seated near 
the head of the table). It seems probable 
that this status-territory relationship is in 
some sense understood by school-age chil- 
dren and that those assigned seats in the 
front might assume, at some level, that 
their teacher values them in a special way. 
Even pupils assigned to the front because of 
their disruptiveness may conclude this if 
they compare themselves with disruptive 
peers who sit in the back. The location of a
child’s seat in and of itself would appear to
be important as well. It seems likely that
the proximity of seats in the front to the
teacher's work area would lead the occu- 
pants to receive more individual attention 
than classmates in the back. Other advan- 
tages inherent in seats in the front of the 
room, such as proximity to the blackboard, 
and the lack of distractions in the line of 
sight, speak for themselves. 

With this view in mind, a method was devised to
investigate teachers’ methods of assigning
seats, and possible differences in the attitudes
and behavior of pupils in front-, middle-,
and back-row seats.


METHOD


Subjects 


Subjects were drawn from one academically 
oriented kindergarten, four first grades, one second 
grade, two third grades, two fourth grades, three 
fifth grades, and one class for the educable re- 
tarded. 


Procedure 


One of the present authors addressed faculty 
meetings in two elementary schools explaining 
that he was planning to study social interactions 
among children. After a discussion of the research
method to be used, 14 teachers volunteered to 
participate. The method followed from this point 
is spelled out below, step by step. 


Premanipulation Procedures 


[...] during time periods when the teacher worked with
her entire class. After 10 minutes, the assistant began
observing the sample, which consisted of all five children in the
front, middle, and back rows.* She then observed
two subjects each minute until she completed the
sample. More specifically, 10 and 20 seconds after
she began observing she noted the category* that
best described the first subject’s behavior at each
of those moments. She then had a 20-second rest
interval. At 40 and 50 seconds after she began observing,
she recorded, in each instance, the category
that best described the second subject’s behavior.
In 7½ minutes the assistant had completed
the first subset of observations on a 15-subject
classroom sample. As time permitted, she went
through this procedure an additional five or six
times and took a total of from 12 to 14 momentary
samples of each subject’s behavior.

*It was expected that classrooms [...] in
atypical fashions would be included [...].
Two classrooms with slightly modified [...]
and, as instructed, the assistants selected for observation
five pupils [...]
the equivalent of the front, middle, and back rows
of these classrooms, respectively. In the few classrooms
where there were more than five pupils per
row, the assistants were instructed to select the
five pupils whom they could see most clearly. In
classrooms where there were an even number of
rows, the assistants were instructed to observe
every other child in the two middle rows.
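
The time-sampling routine can be laid out explicitly. The Python sketch below generates the moments at which each subject is sampled during one pass through the classroom sample; it assumes, as the description implies, that each one-minute cycle covers two subjects (samples at 10 and 20 seconds for the first, a rest, then samples at 40 and 50 seconds for the second). Function and variable names are illustrative only.

```python
def observation_pass(subjects):
    """Return (seconds_from_start, subject) tuples for one pass.

    Two subjects share each minute: the first is sampled 10 and 20 s
    into the minute, the second 40 and 50 s into it.
    """
    offsets = ((10, 20), (40, 50))
    schedule = []
    for index, subject in enumerate(subjects):
        minute, slot = divmod(index, 2)
        for offset in offsets[slot]:
            schedule.append((minute * 60 + offset, subject))
    return schedule


def pass_minutes(n_subjects):
    """Length of one pass in minutes at two subjects per minute."""
    return n_subjects / 2


sample = [f"S{k}" for k in range(1, 16)]   # a 15-subject classroom sample
schedule = observation_pass(sample)
print(len(schedule), "momentary samples in", pass_minutes(len(sample)), "minutes")
```

Each pass yields two momentary samples per subject, so the six or seven passes described (the first plus five or six more) give the 12 to 14 samples per subject reported.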

A second set of observations was made within 
a week of the first. The procedure followed in 
making these was the same as that employed dur- 
ing the first. If any of the subjects were absent, a 
child occupying a seat near the absentee’s was 
added to the sample. 

Teacher Rating 1. After explaining the 5- 
point scale used, the assistant asked the teacher to 
rate the pupils who had been observed on four
dimensions: (1) according to how attentive the 
children are to classroom work; (2) according to 
how disobedient and disruptive the children are, 
according to how much trouble they make; (3) 
according to how withdrawn the children are 
from other children; that is, according to how 
shy they are about making friends; and, (4) ac- 
cording to how likeable the children are. 

Random seating plans designed. The teachers 
were told that new seating plans would be designed 
to “meet the needs of the children.” Actually the 
children were randomly reassigned seats on the 
following basis. One-third from each row under 
study were randomly selected and reassigned seats 
in their original row. Of the remaining two-thirds 
in these rows, half were randomly selected and re- 
assigned seats in each of the other two rows under 
study. Those children who had not been observed 
were randomly reassigned to remaining seats. 
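
The reassignment rule can be sketched as follows. The Python code is an illustration under stated assumptions, not the investigators' procedure: rows are given as lists of pupil names, every observed row must have a size divisible by three (the study observed five pupils per row, which does not divide evenly, and the paper does not say how that was handled), and the function name is invented.

```python
import random


def reassign_rows(rows, rng):
    """rows: dict mapping three row names to lists of pupils.

    One-third of each row stays in its row; the remaining two-thirds
    are split evenly between the other two rows. Returns a new dict.
    """
    names = list(rows)
    if len(names) != 3:
        raise ValueError("this sketch expects exactly three rows")
    new_rows = {name: [] for name in names}
    for name in names:
        pupils = list(rows[name])
        if len(pupils) % 3:
            raise ValueError("row size must be divisible by 3 in this sketch")
        rng.shuffle(pupils)                      # random selection
        third = len(pupils) // 3
        others = [n for n in names if n != name]
        new_rows[name] += pupils[:third]                 # stay in original row
        new_rows[others[0]] += pupils[third:2 * third]   # move to one other row
        new_rows[others[1]] += pupils[2 * third:]        # move to the remaining row
    for name in names:
        rng.shuffle(new_rows[name])              # seat order within a row
    return new_rows


rows = {
    "front":  ["f1", "f2", "f3", "f4", "f5", "f6"],
    "middle": ["m1", "m2", "m3", "m4", "m5", "m6"],
    "back":   ["b1", "b2", "b3", "b4", "b5", "b6"],
}
print(reassign_rows(rows, random.Random(0)))
```

The design crosses original row with new row, so that later differences between front, middle, and back can be separated from the teacher's original reasons for placing a child there.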


Postmanipulation Procedures 


Use of random seat assignments. The teachers
began utilizing the investigator-designed seating
plans immediately following the schools' [illegible]
recess. All but one fifth-grade teacher introduced
the new seating plans as if they had developed
them.


* The categories used were: own work (the child
is attending to the teacher or to work assigned by
her and is not interacting with other pupils);
inactivity or unassigned activity (the child is not
attending to the teacher or work which she has
assigned or is apparently not attending to anything
at all); social friendly (the child is interacting with
another in a friendly manner but not in order to
complete a task assigned to either of them); social
work (the child is doing assigned work in
cooperation with another child); social unfriendly
(the child is interacting with another child in a
hostile manner or is disrupting the class).

Behavioral observations. The third and fourth
observations took place 2 to 3 weeks after the new
plans had been introduced. They were conducted
by the second assistant; that is, the one who had
not worked in that class previously. The observations
were made in the manner described above
and included all subjects who had been observed
earlier.

Teacher Rating 2. Following the fourth observation,
the assistant had the teacher rerate the
subjects on the dimensions described above (attentiveness,
disruptiveness, shyness, and likability).
Administration of the "Me and My Class" 
Questionnaire. A two-page questionnaire was ad- 
ministered in the seven third-, fourth-, and fifth- 
grade classrooms. It included (a) sociometric 
probes in which respondents named classmates 
who were most and least attentive, shy, disruptive, 
and likable (to their teacher); (b) questions about 
seat location preference; and, (c) questions de- 
signed to elicit the pupils’ feeling about themselves 
to investigate whether self-view was related to 
seat assignment. 

Teacher interview. Following a schedule, one
author interviewed each teacher about the techniques
she has used with inattentive, disruptive,
and shy pupils and about seating arrangements and
the effects the investigators’ plan had had on her
pupils’ behavior. At the end of the interview the
teacher was administered the Marlowe-Crowne
(1964) Social Desirability Scale, a paper-and-pencil
test designed to assess respondents' needs to please
others.


RESULTS 


Teacher-Reported Seat Assignment 
Procedure 


The teachers’ descriptions of their proce- 
dures, vis-a-vis seat assignment, were highly 
similar. They commonly described permit- 
ting their pupils to select their own seats for 
the first few days, which allowed them time 
to “get to know the children better.” At this 
point they designed and instituted a seating 
plan in which few changes were made until 
the investigators’ plan was introduced. 

In assigning children to seats the teachers 
made two decisions: to what part of the 
room to assign them and next to whom they 
should sit. Questioning revealed that the 
teachers were selecting children’s neighbors 
in a way which would aid them in main- 
taining behavioral standards in the class- 
room. Specifically, their policy was first to 
separate and disperse disruptive pupils, as- 
signing them to seats in locations where 
they would be less disturbing to the class 
(i.e., seats at the end of a row, between two 
“good” pupils, in the back, or near the 
teacher’s desk). The remaining pupils, it 
was explained, were then assigned to vacant 
seats next to classmates with whom they 
would not cause trouble. Several of the 
teachers used the terms “instinctively” and 
“according to the chemistry of the children” 
to describe the basis on which they made 
the assignments. 

All teachers responded to the question, 
“On what basis do you assign children to 
seats?” by addressing themselves to ex- 
plaining how they decided who should sit 
next to whom. Their policies did not include 
rules which determined who should sit in 
the front, middle, and back of the room. 
Although pressed by the investigator to ex- 
plain this aspect of their procedure, the 
teachers seemed to have few rules which 
they were aware of and could verbalize. 
Most said only that children with sight or 
hearing problems should be assigned to the 
front. 


Teacher Rating 1 


As a result of the sorting procedure that 
teachers were required to use in ranking the 
pupils, teacher ratings on all four variables 
(attentiveness, disruptiveness, shyness, and 
likability) were rectangularly distributed 
with a mean of 3 and a variance of 2.⁵ (In 
those classes in which the sample was not 
15, the mean and variance differed, but not 
significantly, from the above figures.) The 
mean expected rating for the children in the 
front, middle, and back rows was also 3, 
with an expected variance of 2. Inspection 
of the distribution of rankings within class- 
rooms suggested that individual teachers 
had not assigned their pupils randomly with 
respect to how they rated them on these 
variables. Although there was no main ef- 
fect across classrooms, tests of the interac- 
tion term (classroom by row) indicated 
that the teachers had assigned seats sys- 
tematically, but in different ways vis-a-vis 
how they rated the pupils on attentiveness 
(F = 1.70, 26, ∞ df, p < .02), shyness (F = 


⁵ The teachers used a scale of from one to five, 
with the restriction that each category was to be 
used equally often. If the sample size was not a 
multiple of five, the teacher was required to use 
the categories as nearly as frequently as possible. 

TABLE 1 
BEHAVIORAL OBSERVATION PERIODS 1 AND 2: 
MEAN PERCENTAGE OF TOTAL OBSERVATIONS IN 
WHICH SUBJECTS WERE ENGAGED IN EACH 
BEHAVIORAL CATEGORY 


Behavioral observation period          Front    Middle        Back 

Observation 1 
  Own work                              76.4     70.2          67.8 
  Inactivity or unassigned activity     18.0     24.2          22.7 
  Social friendly                        4.1      4.7           7.8 
  Social work                            1.8     [illegible]    1.6 
  Social unfriendly                      12       1             5 

Observation 2 
  Own work                              69.4     65.4          67.9 
  Inactivity or unassigned activity     25.1     27.2          26.1 
  Social friendly                        4.2      6.6           5.6 
  Social work                            5        4             3 
  Social unfriendly                      9        5             1 


2.41, 26, ∞ df, p < .01), and likability (F = 
2.55, 26, ∞ df, p < .01).⁶ 

Those teachers who scored above the me- 
dian on the Marlowe-Crowne Social Desira- 
bility Scale tended to rate the children in 
the front of their classrooms an average of 
.60 units higher on likability than those in 
the back. This contrasted with their coun- 
terparts scoring below the median who 
tended to rate those in the back of their 
rooms an average of .46 units higher on the 
likability dimension. A t test confirmed that 
these two groups of teachers differed with 
respect to their ratings of front and back 
row pupils on likability (t = 1.98, df = 12, 
p < .05, one-tailed). 
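
The mean of 3 and expected variance of 2 quoted for Teacher Rating 1 follow from the forced sorting itself. The sketch below is a quick check and not part of the original analysis. It assumes a class sample of 15 pupils in which each of the five rating categories is used exactly three times, and it computes the variance with N in the denominator; the variable names are illustrative.

```python
from statistics import mean, pvariance

# Assumed sample: 15 pupils, each of the five categories used three times,
# as the forced-sorting rule requires when the sample is a multiple of five.
ratings = [1, 2, 3, 4, 5] * 3

rating_mean = mean(ratings)           # centre of the 1-5 scale
rating_variance = pvariance(ratings)  # population variance (denominator N)

print(rating_mean, rating_variance)
```

For samples that are not a multiple of five the categories cannot be used exactly equally often, so the mean and variance shift slightly, which is the situation the parenthetical remark in the text describes.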


Behavioral Observation Periods 1 and 2 


The number of times a child’s behavior 
had been observed was tallied, and the per- 
centage of this total that fell into each be- 
havioral category was computed. In each 
classroom, a mean percentage for each cate- 
gory was obtained for subjects in the front, 
middle, and back rows. The means across 


⁶ In these analyses of variance tests, and those 
which follow, the classroom was treated as a sub- 
ject, and the front, middle, and back rows as re- 
peated measures on the subject. 

classrooms of these mean percentages are 
presented in Table 1.⁷ It can be seen that 
subjects in the front were engaged in a 
larger percentage of own work and in a 
smaller percentage of inactivity or unas- 
signed activity than their counterparts in 
the other rows. In view of the fact that the 
row means in the 14 classrooms for own 
work, inactivity and unassigned activity, 
and social friendly had a limited range of 
variability and appeared to be normally 
distributed, analysis of variance tests were 
performed. A main effect for rows on own 
work was found in the first observation pe- 
riod (Table 2), but not in the second. 
Contrasts⁸ demonstrated that during the 
first observational period those in the front 
row were engaged in a significantly higher 
percentage of own work than those in the 
back row (p < .05). 


Change in Teacher Rating from Pre- to 
Postmanipulation 


To determine if relocating pupils affected 
their teacher's perception of them, subjects 
were grouped into those who had been 
moved forward, backward, and into new 
seats without a row change. Since most of 
the differences between the teachers' ratings 
of pupils before and after the seat changes 
were of one unit, scores were grouped into 
three categories: positive change, no change, 
and negative change. The data for each 
variable were cast into contingency tables 
and chi-square tests performed. 

No significant changes were found in the 
teachers’ ratings of their pupils on shyness 
and disruptiveness. The overall chi-square 
associated with attentiveness was signifi- 
cant (p < .05) while that associated with 
likability did not reach an acceptable level 


⁷ There was a sizeable difference between the 
overall percentage of own work the subjects were 
engaged in during Behavioral Observation Periods 
1 and 2. The difference could be partly attributable 
to the fact that as the Christmas vacation ap- 
proached the teachers became somewhat more 
permissive and partly to the fact that the students 
had become accustomed to the assistant’s presence. 
⁸ To lessen the likelihood of Type II errors, the 
two orthogonal contrasts which were most mean- 
ingful in terms of the present hypotheses were 
selected and used consistently: the front versus 
the back and the middle versus the front and back. 

(p < .10). Subsequent analysis of pupils 
moved to new rows indicated that those 
moved forward were more likely to be per- 
ceived as becoming more attentive (p < 
.01) and more likable (p < .05) than those 
moved backward. These data are presented 
in Tables 3 and 4. 
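
The chi-square values reported for the pupils moved to new rows can be recomputed from the cell counts printed in Tables 3 and 4. The sketch below is a minimal Pearson chi-square for a contingency table, written without external libraries. The function name and the two critical values for 2 degrees of freedom are supplied here for illustration and are not taken from the article.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: moved forward, moved backward.
# Columns: rated more, no change, rated less (counts from Tables 3 and 4).
attentiveness = [[24, 27, 12], [9, 39, 22]]
likability = [[23, 28, 12], [13, 35, 22]]

# Standard chi-square critical values for 2 degrees of freedom.
CRITICAL_05 = 5.991
CRITICAL_01 = 9.210

chi_att, df_att = chi_square(attentiveness)
chi_lik, df_lik = chi_square(likability)

print(f"attentiveness: chi-square = {chi_att:.2f}, df = {df_att}")
print(f"likability:    chi-square = {chi_lik:.2f}, df = {df_lik}")
```

Recomputed this way, both statistics agree with the tabled values to within rounding, and they fall above the .01 and .05 critical values respectively, matching the significance levels stated in the text.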


Change in Behavioral Observation Scores 
From Pre- to Postmanipulation 


To investigate the effect of seat change 
on the behavioral observation scores the pu- 
pils received, behavioral change scores were 
obtained by subtracting premanipulation 
performance (the first and second observa- 
tion periods) from postmanipulation per- 
formance (the third and fourth observation 
periods). In each classroom, mean change 
scores in each of the five categories were 
computed for those who had been moved 
forward, backward, and to seats in their 
previous row. 

Although a significant main effect was 
not found, Table 5 shows that, as expected, 
those who were moved forward showed the 
greatest mean increase in the percentage of 
time in which they were engaged in own 
work, and the greatest mean decrease in the 
percentage of time in which they were en- 
gaged in inactivity and unassigned activity. 


Pupil Questionnaire 
The number of votes cast by a class on 
each question was totaled and the percent- 


age of that sum received by each pupil cal- 
culated. Then class-vote scores were ob- 


TABLE 2 
BEHAVIORAL OBSERVATION PERIOD 1: 
F RATIOS FOR ROW EFFECT AND FOR CONTRASTS 


                                      Observation 1 
Category                              Front versus    Middle versus     Row effect 
                                      back            front and back 
                                      (df = 1/13)     (df = 1/13)       (df = 2/26) 
Own work                              7.11*           .43               3.70* 
Inactivity or unassigned activity     2.86            2.28              2.55 
Social friendly                       3.72            1.76              3.32 


*p < .05. 

TABLE 3 
CHANGE IN TEACHER’S RATINGS 
OF ATTENTIVENESS 


                    Direction of change in ratings* 
Seat change         More         No        Less         Total 
                    attentive    change    attentive 
Moved forward       24           27        12            63 
Moved backward       9           39        22            70 
Total               33           66        34           133 


*χ² = 11.59, p < .01. 


tained by averaging the percentage of votes 
received by pupils in each classroom who 
had been assigned randomly to rows⁹ in the 
front, middle and back of the room, respec- 
tively. 

As shown in Table 6, differences were 
found in the class-vote scores received by 
pupils in the front, middle, and back on the 
first set of questions asked (who is most 
and least attentive, shy, and likable to the 
teacher). This set of probes paralleled those 
posed to the teachers during the teacher 
ratings. All eight of the correlations be- 
tween the postmanipulation teachers’ and 
pupils’ ratings were statistically significant, 
ranging in size from .26 to .56. 

To determine whether the cultural signif- 
icance of territory and/or the advantages 
inherent in a seat located in the front of the 
room was/were reacted to by elementary 
school children, a second set of questions 
sought to determine seat location prefer- 
ence. The response to these questions was 
similar and best illustrated with an exam- 
ple. In one probe, the subjects were asked 
whether they would prefer a seat in the 
front, middle, or back of their classroom. 
Responses were cast into a contingency 
table and it was found that 56.7% of the 
177 respondents (χ² = 47.2, df = 2, p < 
.001) preferred front row seats. 

A final set of questions was designed to 
elicit the children’s feelings about them- 


⁹ Questionnaire data were obtained from all 
pupils in upper-grade classrooms. With these data, 
those in the front rows (the third of the pupils 
whose seats were closest to the front) were com- 
pared with those in the middle and back rows. 

TABLE 4 
CHANGE IN TEACHER'S RATINGS OF LIKABILITY 


                    Direction of change in ratings* 
Seat change         More       No        Less       Total 
                    likable    change    likable 
Moved forward       23         28        12          63 
Moved backward      13         35        22          70 
Total               36         63        34         133 


*χ² = 6.14, p < .05. 


selves and to investigate whether self-view 
was related to seat assignment. The items 
in this set were: “I pay attention to the 
teacher;” “I am shy;” “I fool around in 
class;” and “the other children like me.” 
Assigning values of 3, 2, and 1 to the re- 
sponse options of “always true,” “some- 
times true,” and “false,” analyses of vari- 
ance were performed. Contrasts indicated 
that those in the front rated themselves as 
significantly smarter (p < .01) than their 
classmates in the back. Also worthy of note 
was the fact that those in the front (and 
middle row seats) tended to respond more 
positively to “the teacher likes me” than 
those in the back. 

In this set of questions the children were 
asked, as were their teachers and peers, 
whether they were attentive, disruptive, 
shy, and likable. The correlations between 
the pupils’ self-evaluation, and the evalua- 
tion of their peers and teachers ranged in 
size from .11 to .32. Although most of these 
correlations were significant, they were not 
as large as those between peer and teacher 
ratings. 


DISCUSSION 


Seat Assignments from the Teacher’s and 
Pupil’s Perspective 

From the postexperimental interviews 
with the participating teachers it was 
learned that the consideration of seat as- 
signment procedures as a social-psychologi- 
cal problem is neither included in teacher 
training programs nor discussed by elemen- 
tary school teachers. Nonetheless, the pro- 
cedures the teachers described using were 
highly similar, suggesting that in assigning 
seats they were reacting to common psycho- 
logical and/or environmental demands. The 
one demand all fourteen teachers spoke of 
in particular was that of achieving class- 
room control. That achieving classroom 
control would be an objective which guides 
teacher behavior is understandable. In ad- 
dition to the personal discomfort which a 
disruptive class may cause its teacher, the 
school often exacerbates the problem by 
placing great weight in teacher evaluation 
on class control. By assigning seats in a 
way which separates children who talk to 
each other the teacher can make progress 
toward classroom control which is speedy 
and clearly visible. 

In any classroom there are many differ- 
ent ways in which teachers can assign spe- 
cific pupils in order to minimize disruption. 
Because the social-psychological implica- 
tions of the seat assignment process are not 
made explicit to teachers, the present au- 
thors believe that they probably design 
their specific seating plan as they unknow- 
ingly react to cultural rules and their own 
personal needs. The most obvious cultural 
rule is that seats (or territories) are to be 
assigned according to the status individuals 
have. The impact status has in terms of 
classroom seat assignment was most clearly 
seen by the investigators in an inner-city 
school in which some preliminary observa- 
tions were made: in the five classrooms 
used, not 1 of the 10 white students was 
assigned to a seat further back than the 
second row. The link reported above be- 


TABLE 5* 
BEHAVIORAL CHANGE SCORES: POSTMANIPULATION 
SCORES MINUS PREMANIPULATION SCORES 


Behavioral category                   Subjects    Subjects with    Subjects 
                                      moved       same row         moved 
                                      forward                      backward 
Own work                              8.96        4.92             [illegible] 
Inactivity or unassigned activity     -7.02       -6.88            -4.68 
Social friendly                       -3.35       -.66             [illegible] 
Social work                           .84         [illegible]      .64 
Social unfriendly                     .59         [illegible]      [illegible] 


* Based on data from 13 classrooms. One was 
lost due to an error in assigning seats. 

TABLE 6 


MEAN CLASS-VOTE SCORE RECEIVED BY OCCUPANTS OF SEATS IN THE FRONT, MIDDLE, AND BACK 
AND ASSOCIATED CONTRASTS 


                                   Class-vote score            F ratio for 
Criteria                           Front    Middle    Back     Row       Front     Middle versus 
                                                               effect    versus    front and 
                                                                         back      back 
Who has been: 
  ...most attentive                5.44     2.15      2.77     7.45**    6.37*     9.74* 
  ...least attentive               3.51     3.45      3.31     ns        .04       .01 
  ...most shy                      4.23     3.97      2.55     ns        7.22*     .30 
  ...least shy                     3.76     3.04      3.58     ns        .09       1.72 
  ...most disruptive               2.83     3.98      3.70     ns        .49       .52 
  ...least disruptive              4.35     3.09      2.88     ns        2.56      1.40 
  ...most likable to teacher       4.76     2.15      3.29     ns        1.00      13.61* 
  ...least likable to teacher      3.54     3.73      3.23     ns        .08       .56 


*p < .05, df = 1/6. 
**p < .01, df = 2/12. 


tween the teachers’ responses to the Mar- 
lowe-Crowne Social Desirability Scale and 
their assignment of pupils they perceived as 
likable speaks to the other factor; that is, 
to the role teachers’ personal needs play in 
the seat assignment process. 

The premanipulation difference found be- 
tween the front and back row pupils in the 
extent to which they engaged in own work 
was probably partly attributable to the 
manner in which the teachers had originally 
assigned the children to seats, a process 
which itself deserves documentation. The 
present data suggest that the teachers, to 
varying extents and probably unknowingly, 
may have assigned several pupils whom 
they had judged as more attentive to seats 
in the front of the room. During the first 
few weeks of school, they might have 
[illegible] assurance that the children were un- 
derstanding and learning; front row pupils' 
proximity to the teacher’s work area may 
have been the primary source for this 
assurance. The difference between the premanipula- 
tion behavioral observation scores of those 
in the front and back rows was probably 
also attributable to the effect seat 
location has on an individual's behavior. 
Another finding was that the majority of 
pupils prefer seats in the front of the room. 
In their response to an open-ended probe ap- 
pearing on the pupil questionnaire, more 


than half the children indicating preference 
for a front row seat stated that their choice 
reflected their belief that they would work 
most effectively in this location. 


Representativeness of the Present Sample 


This study was conducted in two elemen- 
tary schools which serve middle-class 
neighborhoods. Unlike the inner-city 
schools where some pilot work was done, 
the atmosphere in the present schools was 
“relaxed.” There were markedly fewer 
children with behavioral problems, and the 
children, whether in their classrooms or in 
the corridor, were kept under control with 
relatively little effort. These well-behaved 
children provided a stringent test for the 
hypothesis pertaining to differences in the 
behavior of the children in the front, middle, 
and back rows. In contrast to the inner-city 
children, the present pupils displayed a 
more limited repertoire of behavior in the 
classroom. Clearly, with limited variability 
both in the subjects and in their behavior, 
smaller differences might be expected in the 
present sample than in a more heterogeneous 
group with a broader behavioral repertoire. 


CONCLUSIONS AND IMPLICATIONS 


Even in the schools used here, many of 
the anticipated outcomes of the seat assign- 
ment process were found. Although the 
focus of the present study has been on seat 
assignments per se, parallel processes prob- 
ably occur in the classroom vis-a-vis whom 
the teacher calls on to respond to questions, 
whom she selects as monitors, and whom 
she assigns to which reading or arithmetic 
group. Each of these processes undoubtedly 
has consequences for pupils in terms of 
their self-views and educational experi- 
ences. 

In the classroom, a teacher generally has 
a group of from 25 to 35 pupils. In a group 
this size it is difficult for a teacher to be 
fully cognizant of the effect her actions and 
attitudes have on each individual. Her 
effectiveness as a school teacher, however, is 
a function of her ability to assess how her 
actions are effecting change in her pupils. 

A teacher is trained in designing and uti- 
lizing tests and other tools for assessing the 
extent to which pupils are mastering the 
materials she is teaching. In most class- 
rooms academic progress is monitored regu- 
larly and the data so obtained are used by 
the teacher to shape teaching strategies. 
How she relates to her pupils in the process 
of realizing the academic objectives she has 
for her class is also important; a teacher, 
however, is not trained to assess how she is 
psychologically affecting her pupils or how 
her “psychological make-up” is shaping the 
opportunities she is providing. If she does 
not take these factors into account in plan- 
ning her teaching strategies, she may un- 
knowingly effect changes in her pupils 
which are inconsistent with the educational 
goals she has. 


CHILDREN’S CONCEPTIONS OF WORD BOUNDARIES IN 
SPEECH AND PRINT 


MARJORIE H. HOLDEN AND WALTER H. MacGINITIE² 
Teachers College, Columbia University 


Children’s conceptions of word boundaries in speech, and the corre- 
spondence between their conceptions of word boundaries in speech 
and print were investigated. Subjects were 84 children interviewed 
near the end of kindergarten. The children repeated an utterance 
while tapping a separate poker chip for each word. Fifty-seven of the 
children were also taught to identify word boundaries in print and 
were tested for their ability to identify a line of print containing 
the same number of letter clusters as words in an utterance. Identifi- 
cation of function words as separate words depended on context. Few 
children could segment both speech and print conventionally, but 
more could identify the number of letter groups corresponding to their 
own unconventional segmentation of speech. 


As early as 1923, Piaget (1955) noted that 
children “can make a correct use of certain 
difficult terms in their speech, and yet are in- 
capable of understanding these terms taken 
by themselves [p. 146].” Piaget concluded 
that awareness of the sentence precedes 
awareness of individual words. 

Writing in the early thirties, Vygotsky 
(1962) made a similar observation: 


In mastering external speech, the child starts from 
one word, then connects two or three words; a 
little later, he advances from simple sentences to 
more complicated ones, and finally to coherent 
speech made up of series of such sentences; ... 
In regard to meaning, on the other hand, the first 
word of the child is a whole sentence. Semantically, 
the child starts from the whole, from a meaningful 
complex, and only later begins to master the 
separate semantic units, the meanings of words, and 
to divide his formerly undifferentiated thought into 
those units. The external and the semantic aspects 
of speech develop in opposite directions—one from 
the particular to the whole, from word to sentence, 
and the other from the whole to the particular, 
from sentence to word. [p. 126] 


¹ The data for this paper were collected in the 
Shrub Oak Elementary School in Shrub Oak, New 
York. We would like to acknowledge the assistance 
we received from the principal, Joseph Ranellone, 
and the two kindergarten teachers, Judith Mayes 
and Virginia Senk. 
² Requests for reprints should be sent to Walter 
H. MacGinitie, Box 140, Teachers College, Colum- 
bia University, New York, New York 10027. 


Karpova (1955) studied Russian pre- 
schoolers’ awareness of lexical units. A lexi- 
cal unit is to be understood here as a word, 
the conventional dictionary entry (except 
for affixes), the unit that is conventionally 
preceded and followed by a space in written 
language. 

A major finding of the Karpova study 
was that Russian children from the age of 
3½ to 7 generally could not divide a sen- 
tence into its lexical units. Before the age of 
7, most of the children were able to distin- 
guish nouns and make a simple binary divi- 
sion of the sentence indicating the complete 
subject and the complete predicate. The 
children experienced the most difficulty in 
identifying and isolating prepositions and 
conjunctions. Karpova obtained these re- 
sults by employing two different methods: 
a method involving concrete objects as 
counters and a purely verbal one. The con- 
crete method was apparently useful, inas- 
much as certain children could respond cor- 
rectly only when they were permitted to 
combine a verbal response with a motoric 
one. 

In a study employing 66 children between 
4½ and 5 years of age, Huttenlocher 
(1964) presented half the children with the 
task of separating two-word sequences and 
half with the task of reversing the two 
words in these sequences. The most difficult 
items for the children to separate or reverse 
were those that they would have been 
most likely to hear and use in everyday 
speech as language units, such as it is or red 
apple. The sequences that were easiest to 
separate correctly were those that arise sel- 
dom in ordinary spoken language, for ex- 
ample, man table. 

Chappell (1968) has also reported a 
study of children's awareness of lexical 
units. She considered not only the age of the 
children, but their sex and socioeconomic 
status and found a significant difference in 
performance between socioeconomic levels. 
She did not, as Karpova had, employ con- 
crete counters for the children to use in in- 
dicating their responses. 

Karpova's work and the American stud- 
ies using different age groups suggest a slow 
development in the ability to isolate lexical 
units, beginning with concrete words occur- 
ring in the more open form classes and cul- 
minating with the abstract words that gen- 
erally belong to the more closed form 
classes, words such as articles and preposi- 
tions. 

Researchers in this area have uniformly 
used printing conventions as a standard for 
correct segmentation. However, children 
who are preliterate have no way of knowing 
whether himself, for example, is one or two 
words, even if they know that each of the 
morphemes can occur freely. The definition 
of what a word is varies from one linguist 
to another, and if an identical utterance is 
submitted to several authorities, the results 
may not always be in agreement. Extended 
discussion of this question can be found in 
Chomsky and Halle (1968), Greenberg 
(1957), Kramsky (1969), Nida (1949), 
Rommetveit (1969), and Van Wyk (1968). 
In segmenting an utterance, therefore, a 
child may differ from the standard printed 
representation because of idiosyncratic, lin- 
guistic, or developmental reasons. 

Goodman (1969) believes that confusion 
between words as written units and as natu- 
ral units of speech has been responsible for 
the focus on words in the teaching of begin- 
ning reading. Early reading authorities im- 
plicitly assumed that beginning readers per- 
ceived written words as natural units in 
oral discourse. If there is a discrepancy be- 
tween the printing convention of written 
English and preliterate children's intuitive 
identification of word boundaries, confusion 
and difficulty may arise for the beginning 
reader whose intuitive notions of lexical 
units conflict with their conventional repre- 
sentation. How will children who cannot 
identify words in discourse interpret spaces 
on a printed page? Do children who are just 
learning to read try to establish a relation- 
ship between their own conception of 
boundaries in speech and the spaces be- 
tween words in print? 

A small group of recent studies seems to 
show that children who are learning to read 
do, in fact, have difficulty understanding 
the conventions of word boundaries in writ- 
ten English. Hochberg (1970) reports a 
study by Hochberg, Levin, and Frail in 
which white spaces in running text were re- 
placed by a meaningless symbol. This in- 
creased the difficulty of reading for a group 
of advanced readers, but had hardly any ef- 
fect on the performance of beginning read- 
ers. Evidently the spaces between words 
were not particularly helpful to the begin- 
ning readers in identifying meaningful units. 
A study by Meltzer and Herse (1969) of 
39 first-grade children who had been in 
school for 2½ months found confusion of 
the concept letter with the concept word, 
and the assumptions that length of words or 
height of letters are determinants of word 
boundaries. 

Other studies by Clay (1966) and Reid 
(1966) using subjects from different popu- 
lations and employing observational meth- 
ods support the generality of these find- 
ings that beginning readers do not under- 
stand conventional word boundaries. Reid 
(1966) studied 12 children in an Edin- 
burgh school. The children came from a va- 
riety of backgrounds, and were interviewed 
in the second, third, and fourth months of 
their first year at school. Reid observed a 
lack of awareness that spaces indicate word 
boundaries, confusion of the term number 
with the term word, and confusion of the 
term letter with the term word. Downing 
(1969) replicated and extended Reid's work 
using 18 5-year-olds in England. Even the 
most advanced children, according to 
Downing, accepted phrases and sentences as 
examples of words. 

Clay (1966) observed 100 New Zealand 
children during their first year of school. 
She, too, noted that beginning readers had 
difficulty differentiating between letters and 
words. The subjects in this study were 5 
years old at the beginning of the year, and 
Clay noted that many were still confused at 
the age of 6. It seems likely that the ability 
to identify words in context may not coin- 
cide with the beginning of reading instruc- 
tion for all children. If it does not, reading 
may be more difficult for those children who 
can respond to utterances only globally or 
holistically, rather than analytically. 

The aim of the present study was to ex- 
tend previous findings in two different 
areas. First, we attempted to study kinder- 
garten children’s conceptions of word 
boundaries in speech, using a greater vari- 
ety of utterance types than had previously 
been used. Second, we attempted to investi- 
gate the degree of correspondence between 
children’s conceptions of oral and written 
word boundaries. The studies summarized 
above support the contention that a child at 
the beginning of first grade generally does 
not segment language into conventional 
words in either the auditory or the visual 
mode. However, the possibility exists that 
children may make logical attempts to re- 
late print to their own oral segmentations, 
especially if they are given some guidance 
in interpreting the spaces between words. 


EXPERIMENT I 


Method 


The subjects were 84 kindergarten children, 47 
boys and 37 girls. They comprised the total 1967-68 
kindergarten enrollment of an elementary school in 
Northern Westchester County, New York. The 
population of the school is white and pre- 
dominantly middle class. The children ranged in 
age from 5 years, 4 months to 6 years, 8 months 
with a median age of 5 years, 11 months. They were 
tested near the end of the school year in April and 
[illegible]. Each of the children received from 12 to 20 
phrases or short sentences to segment into words. 
Examples of these phrases and sentences will be 
given in the presentation of results. The first 33 
children received the same set of items, then the 
set was revised and expanded for the next 24 
children and again for the last 27 children. Since 
some of the items were common to more than one 
of these sets, the number of children who received 
an item varied from item to item. This number 
will be specified for each item described in the 
results. The changes introduced in the later ver- 
sions of our instrument reflected our attempts to 
verify patterns suggested by the earlier sets of 
responses. There were no systematic differences in 
the groups of children receiving the three versions. 

The children were tested individually. The 
child was seated at a desk opposite the examiner, 
and told he was going to play a "talking and 
tapping game." Then he was shown eight poker 
chips aligned horizontally in front of him. The 
procedure began with a demonstration of the 
game using the model utterance, Elephants live in 
the zoo. This utterance was effective in demonstrat- 
ing that neither syllabication nor compounding of 
words was correct. The examiner showed each 
child how she moved her finger from one poker chip 
to the next as she said each word. After this ex- 
ample, three tape-recorded sample items were ad- 
ministered. Assistance was given with these items 
when the child made an error or hesitated unduly. 
During the test itself, the utterance was played, 
and the child was asked to repeat it. If the child 
did not repeat the utterance correctly, it was 
played again, several times if necessary. When the 
child had repeated the utterance correctly, he then 
repeated it again, tapping one chip for each word. 
Only this final repetition with tapping was scored. 


Results and Discussion 


As with Chappell (1968) and Karpova 
(1955), our results showed that function 
words were more difficult to isolate than 
words that had more lexical meaning. The 
most common error made by the children in 
our study was compounding a function 
word with the following content word. 

Because we used a different methodology 
than Chappell or Karpova, our results are 
not strictly comparable to theirs, but some 
differences are noteworthy. Binary division 
between subject and predicate was a more 
common occurrence in Karpova’s study 
than it was in ours, where it occurred only 
13 times in 603 opportunities. This difference
may reflect differences between the
Russian and English languages or in the 
utterance types that were sampled. Chappell,
using only active declarative sentences,
noted that her subjects’ segmentations coincided
with phrase boundaries in almost all
instances, regardless of whether or not the
words within the phrase were isolated as
lexical units. In our study, combining words 
across phrase boundaries occurred fre- 
quently. For example, The book is in the 
desk was segmented as The book/ is in/ the 
desk, perhaps as a result of spontaneously 
imposing a rhythmic pattern on the utter- 
ances. 

Our subjects’ awareness of small function 
words as free forms appeared to depend at 
least partially on the context in which the 
word was used. Thus, in the utterance You 
have to go home, the word to was compounded
by 12 of 33 children with have
(You/haveto/go/home)² and by 5 of the
33 children with go (You/have/togo/
home). In contrast, in the utterance The
dog wanted to eat, not one of 24 children 
compounded the to with wanted. However, 
10 of the 24 children did compound to with 
eat in this sentence. 

The children’s handling of the word is 
also depended on how the word was used. 
When the word is was used as a copula in 
the sentence Snow is cold, 23 of 27 children 
(85%) were able to isolate the is (Snow/ 
is/cold). Even when the sentence was trans- 
formed to the interrogative, Is snow cold?, 
21 of the 27 children (78%) were successful 
in isolating the verb. However, 23 of 51 
children (45%) compounded the auxiliary 
is with the progressive form of the verb 
drinking in the sentence Bill is drinking 
soda (Bill/isdrinking/soda), and 4 more 
compounded the auxiliary with the preced- 
ing noun (Billis/drinking/soda). When the 
sentence was transformed to the interroga- 
tive, 33 of the 51 children (65%) were un- 
able to isolate the auxiliary is (IsBill/ 
drinking/soda). One may guess from this 
last example that the child’s sensitivity to 
the rhythmic aspects of an utterance may 
indeed influence the way he segments it. 

On almost every item where the word the
appeared, more than half of the group compounded
it with either the following or the
preceding word. In the sentence Houses
were built by the men, 2 of 27 children compounded
the with the preposition by,
whereas 12 of the children compounded it
with the word men (Houses/were/built/
by/themen). In the case of the 2 children
who combined the with by, rather than with
the following content word, the rhythmic
pattern of the sentence may again have
been an influence: Hoüsés wére built by thé
món. Whether some responses are, in fact,
based on rhythm, and what characteristics
of the sentence, the child, and the experimental
situation increase the likelihood of
such responses are questions that remain to
be investigated.

² In such parenthetical examples, various errors
other than the compounding being illustrated are
not shown.

In general, the greater the proportion of 
content words in an utterance, the greater 
the percentage of correct segmentations.
This relationship is illustrated by two simi- 
lar utterances: The dog wanted bones and 
The dog wanted to eat. The dog wanted 
bones was correctly segmented by only 5 
children out of 51 (10%). An additional 34 
children (67%) did it correctly except for 
compounding the with dog. In the sentence 
The dog wanted to eat, 1 child out of 24 
(4%) segmented the entire sentence cor- 
rectly. An additional 7 (29%) segmented it 
correctly except for compounding the with 
dog. An additional 9 children (37%) not 
only compounded the with dog, but also 
compounded to with eat. Thus, if only con- 
tent words are considered, about three- 
fourths of the children segmented each of 
these sentences correctly. 
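
The "about three-fourths" figure can be checked by adding up the counts reported above for each sentence. The short Python sketch below does only that arithmetic; the dictionary keys and the function name are illustrative labels chosen here, not terms used in the study.

```python
# Counts reported in the text for the two utterances.
# "n" is the number of children who received the item; every other entry is a
# group of children whose content words were all segmented correctly.
BONES = {"n": 51, "fully_correct": 5, "only_the_dog_compounded": 34}
TO_EAT = {
    "n": 24,
    "fully_correct": 1,
    "only_the_dog_compounded": 7,
    "the_dog_and_to_eat_compounded": 9,
}


def content_word_share(item):
    """Percentage of children who isolated every content word in the item."""
    correct = sum(count for label, count in item.items() if label != "n")
    return 100.0 * correct / item["n"]


for name, item in (("The dog wanted bones", BONES), ("The dog wanted to eat", TO_EAT)):
    print(f"{name}: {content_word_share(item):.0f}%")
```

Under these counts the shares come to 39 of 51 and 17 of 24, which is consistent with the rounded "three-fourths" in the text.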


Experiment II
Method


The correspondence between children's conceptions
of oral and written word boundaries was
studied with the first 57 of the 84 kindergarten
children who participated in the “talking and
tapping game.” Each of these 57 children was given
at least a dozen utterances to segment before an
attempt was made to see if he could identify
the visual representation of an utterance. Thus, the
child was familiar with the auditory phase of the
testing procedure before the examiner attempted to
discover the nature of the correspondences that the
child inferred between segmented utterances and
their written versions.

After the child had become familiar with the
tapping procedure, he was asked, after [illegible],
to count the number of chips he had tapped. Very
few children had any difficulty doing this, though
a few children had some problem with the longer
items. Help was given as the examiner judged necessary.
The point was to assist the child to determine
accurately how many segments he thought
there were in the utterance he had just heard. After
the count of segments had been made, the child
was shown a 5 x 8 unlined card. Each card contained
four one-line sentences typed in primer
type. The child was asked to indicate the line or
row that had the same number of words as he had
just counted in the recorded utterance.

TABLE 1
TUPC STIMULUS UTTERANCES WITH NUMBER OF CONVENTIONAL AND CONGRUENT RESPONSES FOR EACH

                                    Conventional  Congruent                  Both           Neither
                                    but not       but not                    conventional   conventional
Item                                congruent     conventional  Unscorable   and congruent  nor congruent

Form I (n = 33)
Sample: Drink milk.
Sample: Bill catches the ball.
1. Channel five.                    [2 6 1; remaining entries illegible]
2. Eat your candy.                  [2 7 1; remaining entries illegible]
3. Red balloons are pretty.         10            4             5            5              9
4. John has enough money.           2             4             9            8              10
5. Jane wanted pretty shoes.        2             1             9            14             7
6. Children like chocolate candy.   1             1             9            8              14

Form II (n = 24)
Sample: Drink milk.
1. Channel five.
2. Eat your candy.
3. Red and green balloons popped.
4. Children like chocolate candy.
5. Buy a pair of gloves.
Each of the four one-line sentences contained a
different number of letter clusters. In the first form
of this test, which we designated the Test of Understanding
of the Printing Convention (TUPC,
Form I), the choices on the cards were written in
nonsense words. Nonsense words were chosen to
avoid confusing or assisting any children who had
a stock of sight words. Thirty-three subjects
took this form of the test. This format did confuse
some of the subjects, however, because they
had received some prior training in sound-symbol
correspondences. A revised form (Form II), which
was administered to the remaining 24 subjects, used
the actual words of the utterance. On this revised
form, the same one-line sentence used in the auditory
phase of testing was typed on the card four
times. Only one of the four lines accorded with the
printing convention. The other three lines each
contained a different number of letter clusters. This
was achieved by dividing words into syllables or
putting two or more words together (e.g., Red and
green bal loons popped or Redand green balloons
popped). Table 1 contains the sentences that were
tested by Forms I and II.


Results and Discussion 


The subjects in this study seemed to be 
quite unaware of the printing convention. 
Their lack of knowledge may have been re- 
lated to the classroom emphasis on phonics, 
which concentrated on letter symbols rather 
than words. Because of this possibility, and 
because the study was exploratory in na- 
ture, the children were taught about the 
printing convention and assisted with the 
sample items. Therefore, the score obtained 
by a child is one that confounds knowledge 
of the printing convention with one that 
measures its teachability. 

Teaching about the printing convention
involved presenting the idea that a word, 
for example, the child’s first name, was 
composed of a certain number of letters; 
that his second name was also composed of 
a certain number of letters; and that the 
names were spoken without an extended 
pause in ordinary speech, but were written 
with a visible space between them. 

A response was scored as “congruent” if 
the number of letter clusters in the chosen 
line corresponded to the number of taps the 
child had given. A response was scored as 
“conventional” if the number of letter clusters
in the chosen line corresponded to the
number of words in the line as it would 
ordinarily be printed. A number of re- 
sponses could not be scored for congruence 
because the child tapped an unforeseen num- 
ber of times, and the appropriate 5 x 8 
card included no visual representation cor- 
responding to his segmentation of the utter- 
ance. In those cases where a child's segmen- 
tation of the utterance corresponded to con- 
ventional words, a congruent response 
would also be conventional. The results for 
individual items are shown in Table 1. 

In order to classify the subjects into 
those who responded primarily in terms of 
conventional printing and those who re- 
sponded primarily in terms of the congru- 
ence of the printed segmentation with their 
own segmentation of the utterance, a child 
was classified as responding conventionally 
if he gave at least four conventional respon- 
ses and as responding congruently if he 
gave at least four congruent responses. 
There were some children who gave at least 
four responses that were both conventional 
and congruent. 
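
The scoring and classification rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the tuple representation of a scored response, and the idea of passing the set of cluster counts printed on a card are all introduced here for the example and do not come from the study.

```python
from typing import Iterable, Optional, Set, Tuple

# A scored response: (conventional, congruent), where congruent is None when
# the response could not be scored for congruence.
Score = Tuple[bool, Optional[bool]]


def score_response(taps: int, chosen_clusters: int, printed_words: int,
                   card_options: Set[int]) -> Score:
    """Score one response as conventional and/or congruent.

    taps            -- number of chips the child tapped for the utterance
    chosen_clusters -- number of letter clusters in the line the child chose
    printed_words   -- number of words in the line as ordinarily printed
    card_options    -- cluster counts available on the card (assumed input)
    """
    conventional = chosen_clusters == printed_words
    if taps not in card_options:
        # No line on the card matches the child's own segmentation.
        return conventional, None
    return conventional, chosen_clusters == taps


def classify_child(scores: Iterable[Score], threshold: int = 4) -> dict:
    """Apply the 'at least four responses' rule to one child's scores."""
    scores = list(scores)
    n_conventional = sum(1 for conv, _ in scores if conv)
    n_congruent = sum(1 for _, cong in scores if cong is True)
    n_both = sum(1 for conv, cong in scores if conv and cong is True)
    return {
        "conventional": n_conventional >= threshold,
        "congruent": n_congruent >= threshold,
        "both": n_both >= threshold,
    }
```

For example, a child who taps three times for a four-word sentence and then picks the three-cluster line is scored congruent but not conventional; a child who taps seven times when the card offers only two to five clusters is unscorable for congruence.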

Only 5 of the 33 subjects who received 
Form I of the TUPC and none who received 
Form II segmented four or more of the ut- 
terances conventionally and also identified 
the corresponding visual representation 
(responded both conventionally and con- 
gruently). No children in either group con- 
sistently chose the conventional representa- 
tion of an utterance if they did not also 
segment it correctly in the auditory mode. 
Several of the children, however, were evi- 
dently able to base their responses on a cor- 
respondence between their own segmenta- 
tion of the utterance and the visual repre- 
sentation of it as requested in the instruc- 
tions. Three subjects who received Form I 
and nine subjects who received Form II re- 
sponded congruently (but not convention- 
ally) on at least four items. Still, there were 
18 subjects receiving Form I and 7 receiv- 
ing Form II who did not respond consist- 
ently either conventionally or congruently. 
(The remaining 15 subjects gave some re- 
sponses that were unscorable for congru- 
ence.) 
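
The counts in this paragraph can be reconciled with the sample sizes by simple addition. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Subjects per form, and the three classifications reported in the text.
form_sizes = {"Form I": 33, "Form II": 24}
both = {"Form I": 5, "Form II": 0}               # conventional and congruent
congruent_only = {"Form I": 3, "Form II": 9}     # congruent, not conventional
neither = {"Form I": 18, "Form II": 7}           # no consistent pattern

classified = sum(both.values()) + sum(congruent_only.values()) + sum(neither.values())
remaining = sum(form_sizes.values()) - classified

print(classified, remaining)
```

The 42 classified subjects plus the remaining 15 account for all 57 children tested.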
CONCLUSIONS


It is evident that a great many children 
at the end of kindergarten are not familiar 
with the printing convention. Those children
who did understand, after brief instruction,
that spaces between words function as
indicators of word boundaries tended to divide
utterances into units that did not correspond
to traditional printed words.
Clearly, a first-grade teacher cannot take
for granted that children will understand 
her when she talks about “words” and their 
printed representation. Nor can she assume 
that the concepts can be quickly and easily 
taught, since printed word units do not cor- 
respond to the way the child thinks the ut- 
terance should be divided. While children’s 
conceptions of word boundaries may fluc- 
tuate from one context to another, it intui- 
tively appears that the conception often re- 
flects linguistic rather than conventional 
definitions of words. Further studies on dif- 
ferent groups of children need to be done to 
determine the value to beginning readers of 
specific instruction with regard to utterance 
segmentation and printing conventions. 

The children’s performance on these tasks 
of language segmentation raises theoretical 
questions about the nature of first language 
learning. Do the units that the child ab- 
stracts from experience with the language 
correspond to the units that generative 
grammars imply he should be able to ma- 
nipulate? Studies should be undertaken to 
explore this question fully and to determine 
how the ability to abstract language units 
from auditory context is related to the de- 
velopment of cognitive structures. 
A study was conducted to test the hypothesis that high-curiosity
children exceed low-curiosity children in recognition of verbal ab- 
surdities. The study was based on 191 fifth-grade predominantly 
white suburban children. The children were rated on curiosity by
their teachers, their peers, and themselves. A 51-item test of verbal 
absurdities was administered to all of the children. Children in the 
upper-half and upper-third in curiosity were significantly better in 
recognizing verbal absurdities than were their low-curiosity counter- 
parts. These results may indicate that high-curiosity children seek 
more when they read and thus are able to see small absurdities. 


Interest in curiosity, as a subject for study 
and research, has increased rapidly in recent 
years (e.g., Berlyne, 1960; Day, 1967; Lang- 
evin, 1970; Maw & Magoon, 1971). There 
has not been a corresponding amount of re- 
search in the area of verbal absurdities. For 
the most part, absurdities, as such, have been 
classified under humor since, as Leacock 
(1959) has pointed out, it is difficult to draw 
distinctions among the humorous, the ab- 
surd, the funny, the foolish, the comical, the 
nonsensical, and the ridiculous “because 
such distinctions lack a scientific basis” 
and “the terms have a latitude and shift of 
meaning which renders real classification
impossible.” 

Since the recognition of absurdity prob- 
ably is a necessary part of much, if not all, 
humor, studies bearing on absurdities may be
considered under the broader rubric of hu- 
mor. The study of humor has been of interest 
for many years (e.g., Allport, 1937; Levine,
1969; Rosenwald, 1961, 1964). A review of
this literature indicates that certain variables
including those which Berlyne (1960) calls
collative variables are associated with curiosity,
with humor, and no doubt, with verbal
absurdities too.

1 Requests for reprints should be sent to Wallace
H. Maw, Willard Hall Building, College of Education,
University of Delaware, Newark, Delaware 19711.

Earlier, Freud (1959) had reported evi- 
dences of curiosity which he labeled scopto- 
philia, or the desire to look around. This idea is 
incorporated into the definition of curiosity 
used in this study in that a high-curiosity child 
is viewed as one who “scans his surroundings 
seeking new experiences." Freud also wrote 
at length about humor (1960) relating it to 
aggression, sex, dreams, the unconscious, 
and other aspects of psychoanalytical theory.
Levine (1969) in discussing Freud’s ap- 
proach to humor, said that it permitted “one 
to think like a child and thereby escape the 
constraints of rationality and logic of cognitive
functioning.” In this sense, humor is
liberating. The effect is somewhat similar to 
that of security which Maw and Maw (1965) 
found to be related to curiosity. The secure 
child could dare to explore further and fur- 
ther into his environment. 

Allport (1937) reported that the most 
striking correlate of insight is a sense of humor.
Epstein and Smith (1956) found that
the two are related and they hypothesized
that the relationship is found only when the
jokes are about the person himself. It appears
that only those persons approaching
self-actualization (Maslow, 1956) or those
who feel highly self-worthy (Maw & 
Magoon, 1971) are uninhibited enough and 
free enough from outside pressures to have
good insight not only into themselves but 
also into other areas. These persons also rate 
high in curiosity (Maw & Maw, 1965). 

Like curiosity, humor is a paradox. While 
one writer may indicate that humor gives 
one an opportunity to escape cognitive con- 
finement, other investigators (e.g., Freud, 
1960, and Levine, 1968) have suggested that 
the pleasure of using one’s cognitive processes 
in relation to humor contributes to grati- 
fication. Zigler et al. (1967) indicate that 
"the sheer ability to comprehend a joke . . . 
may be related to what White (1959) has 
referred to as the effectance motive." If this 
statement is valid, one would expect the
greatest reaction to humor “at the upper 
limits of the individual's cognitive ability." 
Zigler et al. (1967) conclude “that (a) the
degree of congruence between the complexity
of the humor stimulus and the complexity
of the observer's cognitive structure, and
(b) the strength of the observer’s motive
after mastery” determine the strength of the 
humor response. The congruity-incongruity 
continuum which seems to underlie humor 
was investigated by Godkewitsch (1968) 
in the Netherlands. He believes that incon- 
gruity of some degree is basic in humor. 
Berlyne (1960) considers incongruity basic
to curiosity and humor.

It is probable that many personality 
variables will be found to be related to both 
curiosity and humor. According to Wilson 
and Patterson (1969), a conservative person 
is more likely to respond positively to and to
appreciate “safer jokes” than a more liberal
individual. Guttman and Priest (1969) show
that social perception plays an important
role in humor. In studies by Maw (1967)
and W. Maw and Magoon (1971), it was
shown that high-curiosity children are apt
to be more socially oriented than their low-curiosity
counterparts.

In the present study, it was hypothesized
that children rated high in curiosity would
recognize verbal absurdities significantly
more frequently than would children rated
low in curiosity.
Since the concept of curiosity is not the same
for everyone, it was necessary to define it more 
precisely for this investigation. After making for- 
mal and informal interviews to ascertain what 
people thought about children’s curiosity, re- 
viewing literature on the topic, and studying older 
and modern dictionary definitions of the word, it 
was concluded that an elementary school child 
manifests curiosity to the extent that he (a) reacts 
positively to new, strange, incongruous, or mysterious
elements in his environment by moving
toward them, by exploring them, or by manipulating
them; (b) exhibits a need or a desire to know
more about himself and/or his environment; (c) 
scans his surroundings seeking new experiences; 
and/or (d) persists in examining and exploring 
stimuli in order to know more about them (Maw 
& Maw, 1964). 

In order to establish criterion groups of high- 
and low-curiosity children, pupils were rated by 
their teachers, their peers, and themselves on in- 
struments based on the above definition. Teachers 
ranked their pupils from high to low. The index of 
stability of these rankings was found to be .77.
The peers matched their classmates to character 
sketches derived from the definition. A child’s 
score was a weighted sum of the times his name 
was listed. These scores were transformed to 
McCall T scores. In order to obtain an estimate of 
reliability for this instrument, each class partici- 
pating in an earlier study (Maw & Maw, 1964) 
was divided into two parts. The children were 
ranked according to the ratings each received 
from each half of the class. The correlation ob- 
tained between the two halves was .45. The corre- 
lation between peer ratings and teacher ratings 
was .54. The pupils judged themselves on a 41- 
item instrument which gave them an opportunity 
to express how frequently they did or did not par- 
ticipate in certain types of behavior. An earlier 
index of consistency obtained for this instrument 
was .91 (Maw & Maw, 1968). The criterion groups 
that were finally chosen were based on a composite 
rating in which all three evaluations were given 
equal weight.
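
As a rough illustration of how three ratings on different scales might be combined with equal weight, the Python sketch below standardizes each rating and averages the results. The article does not describe the computation beyond "equal weight", so the standardization step, the function names, and the assumption that every rating is oriented so that higher means more curious are all assumptions made here. The McCall T score is taken in its usual sense of a standard score with mean 50 and standard deviation 10.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of ratings (assumes the ratings are not all equal)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def mccall_t(values):
    """Convert raw scores to T scores with mean 50 and standard deviation 10."""
    return [50 + 10 * z for z in z_scores(values)]


def composite(teacher, peer, self_rating):
    """Equal-weight composite: the mean of the three standardized ratings.

    Each argument lists one rating per child, in the same child order, and is
    assumed to be oriented so that a higher value means more curiosity.
    """
    standardized = [z_scores(teacher), z_scores(peer), z_scores(self_rating)]
    return [mean(triple) for triple in zip(*standardized)]


# Hypothetical ratings for three children, for illustration only.
print(composite([1, 2, 3], [40, 50, 60], [20, 25, 30]))
```

Children could then be ordered on the composite and split into halves or thirds, as the next paragraph describes.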

Since there was a moderate correlation between 
intelligence and curiosity (r = .55), and since in- 
telligence might also be a factor in recognizing 
verbal absurdities, the pupils were matched for 
verbal IQ, nonverbal IQ, and age. Intelligence 
was measured by the Lorge-Thorndike Intelligence
Tests. Four groups were formed: upper- and
lower-halves of the distribution and upper- and 
lower-thirds. Table 1 contains descriptive data 
for the curiosity groups. 

The study was based on a sample drawn from 
school districts specifically selected because their 
pupils were considered to be fairly representative 
of elementary school children generally. The 
children were fifth-grade pupils attending New 
Castle County, Delaware schools. There were no
children from the City of Wilmington in the study. 
For the most part the children were white; the
numbers of children from other races were insignificant.

There was no effort to control for sex differences.
An examination of the selected criterion
groups showed very little imbalance between the
sexes. In the upper third of the curiosity group,
there were 21 boys and 10 girls; in the lower third,
there were 17 boys and 14 girls. In the upper half,
there were 32 boys and 19 girls; in the lower half,
there were 22 boys and 29 girls. When tested, using
the chi-square technique, these differences were
found to be insignificant.

WALLACE H. MAW AND ETHEL W. MAW

TABLE 1
NUMBER, MEAN AGE, AND MEAN LORGE-THORNDIKE VERBAL AND NONVERBAL IQS
FOR UPPER-CURIOSITY AND LOWER-CURIOSITY GROUPS

Curiosity group    N     Mean age    Mean verbal IQ    Mean nonverbal IQ
Upper half         51    128.92      103.71            100.61
Lower half         51    127.42      103.80            100.24
Upper third        31    129.03      103.29            101.71
Lower third        31    129.54      103.32             99.97

Note. Mean age is in months.

A 51-item test of verbal absurdities was developed.
Some of the items were common absurdities;
others were straightforward statements, as
can be seen in the test, itself, that follows.
FOOLISH SAYINGS

Some of the following statements have parts in them that make them foolish. Other statements are all right. Put an X before the foolish ones; and a C before those that are all right.

X  1. Bob said to Jack, “I'll meet you at the lodge. If I get there first, I'll make a chalk mark on the door. If you get there first, rub it out.”
   2. He held the flashlight so I could see to change the tire.
   3. Bob worked fast so he could fill the baskets before he ran out of apples.
   4. The soldiers were outnumbered so they gave up without a fight.
   5. A man who was charged with driving at the rate of 70 miles per hour insisted that he could not be guilty since he had been out driving for only one-half hour.
   6. The sunlight pouring in through windows on all four sides of the room made such a glare that I hurried to pull the blinds.
X  7. If you can't read type of this size, you need glasses.
C  8. If you can't swim, you should learn.
C  9. She hurried to school so she would not be late for her first class.
  10. If you can't open the jar, put the lid back on and place it in the refrigerator.
  11. “There is nothing I like better than dessert,” she said as she took her second piece of apple pie.
  12. Jane wanted to call the new boy in her class, but couldn't remember his name. She called the operator and asked her to look up the boy's number.
  13. If each of us would give ten cents, we would have enough money to buy the baseball.
  14. There have been no girls born in my family for the past three generations.
C 15. The crowd pushed forward to see what had happened. They made it impossible for the rescue teams to work.
X 16. Abraham Lincoln once said that in his opinion Lindberg was a greater man than Napoleon.
C 17. Benjamin Franklin said, “We had better hang together or we will hang separately.”

  18. The ladder broke when my father was painting the house. Luckily he was not far from the ground when it happened, so he was not hurt.
  19. On their trip to the fire station, none of the little children wanted to be last in line. The kind teacher solved the problem by letting the last child in line come up and walk with her.
  20. The judge ruled that the man could not marry his widow's sister.
  21. Russia is located 100 miles from the United States.
  22. The teacher said to the pupil, “Write the answers to the first five questions on the blackboard. You should finish by the time I return.”
  23. Why is Sam Brown so stingy with his brother who lives with him? He has so much wealth that he could give his brother half of it and still be the richest man in town.
  24. The soldier said, “If I have to choose between spending the rest of my life in jail or in the grave, I'll choose jail.”
  25. His uncle is twice as old as my father was when I was born.
  26. The rancher and the cattle rustler exchanged several shots before the [illegible] fell dead. The sheriff said that the rancher's first shot had jammed the rustler's gun.
  27. My sister was in town shopping when she remembered she had to be at a meeting at 3 o'clock. She called to ask me to meet her at 2:30 and drive her to the meeting.
  28. John wanted to be a basketball player. He practiced an hour every evening before he started his homework.
  29. My grandparents have no daughters and no sons.
  30. She picked up the melted ice cubes and dropped them into the pail.
  31. She swept the rug and then rolled it to one side of the room, so the boys and girls could play games.
  32. The breeze made the branches of the trees sway and the waves lap softly against the bank.
  33. Helen said, “I have three sisters: Mary, Susan, and myself.”
  34. Mrs. Jones mailed her order to the store. At the bottom of the order she wrote, “Please let me know if you do not receive this order.”
  35. Father said, “Come to the table and I will show you how to play the game.”
  36. The captain said, “Don't shoot until you see the whites of their eyes.”
  37. The old man climbed three flights of stairs, turned right and opened the third door from the end of the hall.
  38. It is a week from Christmas to New Year's but almost twelve months from New Year's to Christmas.
  39. He was such a popular child star that although he died before he was ten years old, he left enough money to make all of his grandchildren wealthy.
  40. The teacher said, “The earth is moving at the equator almost 1000 miles per hour.”
  41. John spent half of his money, but still had more left than he had spent.
  42. A wheel came off the car while my mother was driving. She didn't know how to put it on by herself, so she drove to the nearest station for help.
  43. The family had taken a long vacation. They told their neighbor that they had taken over 200 pictures.
  44. He was born in a log cabin but he became President of the United States before he died.
  45. Give me my glasses and turn out the light so I can read the newspaper.
  46. “I am very fond of dogs,” he said as he patted the puppy's head before kicking it out of the way.
  47. “Shoot if you must, this old gray head, but spare your country's flag,” she said.

  48. He ran quickly up the stairs and hurried down the hall. He was hoping he would be late for his first class.
  49. A tremendous roar accompanied the thunderbolt that knocked out the power station. In the hush that followed, the only sound was the whir of the electric fan.
  50. The first man to climb to the top of the mountain found there the remains of what was once an Indian look-out station.
  51. The man who was stopped for speeding told the policeman that he was hurrying to reach a filling station before he ran out of gas.

The reliability of the 51-item test was estimated
to be .77. If the test were to be expanded to 100
items, the coefficient would approach .87. It is, 
therefore, suggested that if the instrument is to 
be used in further studies, the number of items 
should be increased keeping the ratio of absurdi- 
ties to straightforward statements the same as in 
the 51-item test. 
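
The projected coefficient of about .87 is what the Spearman-Brown prophecy formula gives for lengthening a test from 51 to 100 items. The article does not name the formula it used, so treating it as Spearman-Brown is an assumption; the Python sketch below simply shows that the reported figures are consistent with it. The function name is illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor.

    reliability   -- reliability of the current test (here .77)
    length_factor -- new length divided by old length (here 100 / 51)
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


projected = spearman_brown(0.77, 100 / 51)
print(round(projected, 2))  # 0.87
```

The formula assumes the added items are comparable to the existing ones, which fits the recommendation above to keep the ratio of absurdities to straightforward statements unchanged.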


RESULTS


Table 2 shows the results of the study. As 
had been predicted, children higher in curi- 
osity recognized more of the absurdities than 
did children lower in curiosity. The mean of 
the high-curiosity group exceeded the mean 
of the low-curiosity group by 5.29 points for 
thirds and by 4.29 points for halves. For 
thirds, t was 2.80 (p < .005); for halves, t
was 2.50 (p < .01). The curiosity groups
were not significantly different in variability.
The difference between the means of scores
of boys and girls was less than one score
point and was not significant.

TABLE 2
MEAN, VARIANCE, AND RESULTS OF TESTS OF
SIGNIFICANCE OF DIFFERENCES BETWEEN MEANS
AND BETWEEN VARIANCES OF SCORES ON A
VERBAL ABSURDITIES TEST FOR CURIOSITY
GROUPS AND FOR BOYS AND GIRLS

Group         N     M        Variance   F       t
High third    31    23.29    51.66      1.31    2.80**
Low third     31    18.00    67.80
High half     51    23.06    53.76      1.36    2.50*
Low half      51    18.77    73.17
Boys          95    23.05    67.57      1.03    .55
Girls         96    22.40    65.60

* p < .01.
** p < .005.
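
The mean differences quoted in the text and the F column of Table 2 can be recomputed from the tabled means and variances. The Python sketch below assumes that F is the ratio of the larger variance to the smaller one in each pair, which reproduces the tabled values; the t values are not recomputed here because the article does not state which form of the t test was applied to the matched groups. The dictionary layout and function names are illustrative.

```python
# Group: (N, mean, variance), copied from Table 2.
TABLE_2 = {
    "High third": (31, 23.29, 51.66),
    "Low third": (31, 18.00, 67.80),
    "High half": (51, 23.06, 53.76),
    "Low half": (51, 18.77, 73.17),
    "Boys": (95, 23.05, 67.57),
    "Girls": (96, 22.40, 65.60),
}


def mean_difference(group_a, group_b):
    """Difference between two group means, rounded to two decimals."""
    return round(TABLE_2[group_a][1] - TABLE_2[group_b][1], 2)


def variance_ratio(group_a, group_b):
    """Larger variance divided by smaller variance, rounded to two decimals."""
    var_a, var_b = TABLE_2[group_a][2], TABLE_2[group_b][2]
    return round(max(var_a, var_b) / min(var_a, var_b), 2)


for pair in (("High third", "Low third"), ("High half", "Low half"), ("Boys", "Girls")):
    print(pair, mean_difference(*pair), variance_ratio(*pair))
```

The recomputed differences are 5.29 and 4.29 points for the curiosity groups and 0.65 points for boys versus girls, in line with the statement that the sex difference was less than one score point.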


Discussion 

Since high-curiosity children obtained 
significantly higher scores on the verbal 
absurdities test than did low-curiosity chil- 
dren, it might be concluded that high-curi- 
osity children are more aware of small differ- 
ences than are low-curiosity children. This 
finding is in accord with part of the definition 
of curiosity used in this study, specifically 
the idea that high-curiosity children keep 
scanning their environments looking for new 
information. 

It might also be concluded from this study 
that high-curiosity children are more sensi- 
tive to humor than are low-curiosity chil- 
dren. It is likely that humor did enter into 
the labeling of statements as foolish by 
many, if not most, of the children. Certainly, 
recognition of the absurdity of the statements
is basic to finding them funny. The
differences found in the study, however, were 
in recognition of absurdities, not in seeing 
the humor in them. There was laughter as 
the children made their selections, but no 
effort was made in the study to determine 
the number of humor responses nor their 
degree. Therefore, it is not appropriate to 
conclude that the groups differed in sense of 
humor. Further study is needed to clarify 
this point. Such studies might ask pupils 
not only to indicate which items they con- 
sider to be foolish but also which they con- 
sider to be funny. 
EFFECT OF DIFFERENTIAL COLLEGE EXPERIENCES IN
DEVELOPING THE STUDENTS’ SELF- AND 
OCCUPATIONAL CONCEPTS. 


KEITH J. EDWARDS² BRUCE W. TUCKMAN


Johns Hopkins University Rutgers University 


The self-concept, social role, and occupational identification of 68 uni- 
versity liberal arts students were compared to those of three groups of 
community college students before and after 2 years of college experience.
Multivariate and univariate analyses of variance and discriminant
analysis revealed that the community college groups initially had
lower self-esteem and identified with lower status occupations (p < 
.05) than the university liberal arts group. Two years later, the differ- 
ences in self-esteem did not exist, the community college liberal arts 
students identified with higher status occupations, and the technical 
and business students identified more closely with their respective 
occupations. Within-group change in terms of occupations with which 
the students most, closely identified was evident in the community col- 
lege liberal arts and business. groups (initial-final rank-order correla- 
tions of .66 and .50, respectively) ; and rank order of occupational iden- 
tification for the technical and university liberal arts groups was quite 
stable (initial-final rank-order correlations within groups of 88 and 89, 


respectively). 


The purpose of the present study was to 
determine the extent to which students at- 
tending a community college differ from 
their university counterparts in terms of 
their self-concept and perceptions of social 
and occupational roles at the outset, and to 
examine the degree to which these initial 
differences were amplified or eliminated as 
a result of 2 years of differential college
experiences. 

In their review of the literature, McCullers
and Plant (1964) suggested that research
in higher education had eliminated
college experience as an "independent variable"
in personality change. However, Skager,
Holland, and Braskamp (1966) found
positive relationships of college characteris- 
tics with self-ratings and life goals. The lat- 
ter authors concluded that their success was 
due to the nature of the variables studied 
—direct self-descriptions rather than per- 
sonality scales—and argued that these are 
more relevant to the problem. The present 
study of college effects employed similar 
self-description variables. 

One can question the effect that the com- 
munity college experience has on the ways 
the student perceives himself and various 
occupational and social roles. The low so- 
cial status of the community college rela- 
tive to the university may be detrimental to 
the student’s self-concept, as well as his so- 
cial and occupational identification. Propo- 
nents of the community college, on the 
other hand, argue that the experience allows 
the student to explore educational alterna- 
tives and conceive himself as a college stu- 
dent without the competitive pressure of the 
university. Thus, the student’s self-esteem 
and views of occupational and social roles
that may have hitherto been beyond his 
realm of possibility should be enhanced. 

Super's theories (1957, 1963) about self- 
concept, vocational development, and their 
interrelationship provide the bases for our 
hypothesis of a positive effect. Within his 
theoretical framework, Super (1957) de- 
fined five stages of vocational development: 
growth, exploration, establishment (imple- 
mentation), maintenance, and decline. 
Within the implementation stage (the one
relevant to this study), the following
activities can be enumerated: (a) confirmation
and verification of choice, (b) professional
identification, and (c) knowledge
of self and role requirements. In the process,
self-concepts are continually modified
as new experiences are incorporated or as- 
similated into the individual’s cognitive 
structure. 

The 2-year college program not only al- 
lows more students and different types of 
students to have a college experience but it 
also provides occupationally relevant expe- 
riences for some students, which should con- 
tribute to their vocational development as 
part of the implementation stage. Since 
training in many ways provides a taste of 
an occupation, it can allow the student to 
test his choice, gain professional identification,
and gain knowledge of himself and the
role requirements of his occupation-to-be.
Not only should the 2-year college program
lead the student to consider occupations of
more diversity and greater status than he
might have heretofore, but it should also
allow students to increase the specificity of
their career goals, particularly those students
enrolled in occupational programs.
Thus, the 2-year college experience is expected
to increase self-esteem, level of aspiration,
and specificity of occupational concepts
(i.e., closer identification with a chosen
field). These outcomes would be consistent
with Super’s formulations on vocational
development.


METHOD


This study was both longitudinal and cross-sec- 
tional in nature. Students enrolled in community 
college programs were followed over a 2-year pe- 
riod, and were also compared to university students 
at the beginning and end of this period.
The Multiple Repertory Test, as developed and
used by Starishevsky and Matlin (1963), Bingham
(1966), and Rampel (1907), was used as the
dependent variable. Students were given a form of
the Role Construct Repertory Test (RCRT;
Kelly, 1955) by which they created 12 pairs of
bipolar adjectives. The purpose of this step was to
obtain a semantic space consonant with the
subject’s frame of reference. The subjects then rated
20 concepts on 7-point scales using the 12 bipolar
adjectives generated on the RCRT. The dependent
variables were created by taking the absolute
difference between each of the 12 adjective scale
ratings on "I am" and the corresponding scale ratings
for one of the other concepts and summing over
the 12 adjective scales. This procedure yielded a
discrepancy score between "I am" and each of the
other 19 concepts. The discrepancy scores thus
calculated could range from 0 (ratings on both
concepts identical for each scale) to 72 (ratings on the
two concepts at extreme opposite ends for each
scale; i.e., $|1 - 7| = |7 - 1| = 6$; $6 \times 12$ scales $= 72$).
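The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `discrepancy` and the example ratings are hypothetical and do not come from the study; only the 7-point scale, the 12 adjective scales, and the sum of absolute differences are taken from the text.

```python
from typing import Sequence

SCALE_MIN, SCALE_MAX = 1, 7   # 7-point rating scales, as described in the text
N_SCALES = 12                 # 12 bipolar adjective pairs per subject


def discrepancy(self_ratings: Sequence[int], concept_ratings: Sequence[int]) -> int:
    """Sum of absolute rating differences between "I am" and another concept."""
    if len(self_ratings) != len(concept_ratings):
        raise ValueError("both concepts must be rated on the same scales")
    return sum(abs(a - b) for a, b in zip(self_ratings, concept_ratings))


# Hypothetical ratings for one subject (illustrative values only).
i_am = [5, 3, 6, 2, 4, 7, 1, 5, 4, 3, 6, 2]
i_wish_i_were = [6, 2, 7, 1, 4, 7, 2, 6, 5, 2, 7, 1]

self_esteem_score = discrepancy(i_am, i_wish_i_were)

# Largest possible score: opposite scale ends on every one of the 12 scales.
max_score = discrepancy([SCALE_MIN] * N_SCALES, [SCALE_MAX] * N_SCALES)
print(self_esteem_score, max_score)
```

A smaller score means the two concepts were rated more alike, so a low "I am" versus "I wish I were" score is read as higher self-esteem.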


The 19 dependent variables (19 discrepancy
scores) fell into three categories:

1. Self-esteem, which was measured by the
   discrepancy score between "I am" and the concept:

   I Wish I Were.

2. Social-role incorporation, measured by the
   discrepancy scores between "I am" and the four
   concepts:

   High Society             Cultured Person
   Outstanding Citizen      Community Leader.

3. Occupational incorporation, measured by the
   discrepancy scores between "I am" and the
   14 concepts:

   Teacher                  Clerk
   Doctor                   Bookkeeper
   Lawyer                   Electrician/Plumber
   Accountant               Truck Driver
   Engineer                 Mechanic/Machinist
   Technician               Policeman/Fireman
   Business Executive       Salesman.


It should be noted that the greater the self-esteem
or the level of incorporation of an occupational or
social-role concept, the smaller the discrepancy
score would be.

While no reliability data were available for the
present study, a multitrait, multimethod
study of the Repertory Test as a measure of
self-esteem was conducted by Silber and Tippett
(1965). The discrepancy measure of self-esteem had
high convergent validity with three other measures
of the same trait (difference between self and ideal
self, r = .81; global self-esteem, r = .67; interview
rating of self-esteem, r = .63). The discriminant
validities of the Repertory Test were in
the direction recommended by Campbell and Fiske
(1970). (Heterotrait, monomethod r = [illegible];
heterotrait, heteromethod r_1 = .30, r_2 = .41, r_3 = .12.)

The independent variable for the study was type
of college experience. The four levels of this
variable were: (a) community college liberal arts
(CCLA), (b) community college technical (CCT),
(c) community college business (CCB), and (d)
university liberal arts (ULA).

The subjects were males, all of whom had gradu- 
ated from high schools in the same county in cen- 
tral New Jersey in June of 1967. The MRT was 
administered to subjects in each of the four groups 
in September of 1967. The sample sizes by group 
were CCLA = 74; CCT = 63; CCB = 117; and 
ULA = 93. For the first three groups the subjects 
were a random sample of a larger group tested. The 
93 ULA subjects represented all male university
liberal arts students from the county. 

The community college studied was approxi- 
mately 1 year old at the time of the testing. The 
incoming freshman class numbered about 1,000. 
There were approximately 400 second-year stu- 
dents. The college was located on a suburban 
campus and was engaged in a building program. 
Liberal arts (transfer) and occupational (terminal) 
programs were available as well as a prevocational 
program. Business and technical programs included
laboratory and classroom study and were 2 years 
in duration. 

The university studied was a public state uni- 
versity with an enrollment of about 23,000 students 
on all its campuses. Its major campus was located 
about 6 miles from the community college and 
contained an all-male undergraduate college with 
an enrollment of about 5,000 students and about 
1,400 students in each incoming class. Liberal arts,
agriculture, and engineering programs were
available.

The follow-up testing was carried out in April
of 1969 on those subjects still available. The CC
students were followed up by mail, while the ULA
subjects were tested in person. The sample size for
this test was CCLA = 46; CCT = 25; CCB = 56;
ULA = 68. This represented an experimental
mortality rate ranging from 26% to 60% per group.
A comparison of the total 1967 sample and the
remaining 1969 subsample on initial test means for
each of the four groups indicated negligible bias
due to experimental mortality. (Two additional
groups tested in 1967, noncollege and pretechnical,
were dropped due to differential experimental
mortality.) Only subjects for whom both tests were
available were included in the study.
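The attrition figures quoted above follow directly from the two sets of sample sizes. As a quick check, the arithmetic can be sketched as below; the variable and function names are illustrative, and only the group sizes are taken from the text.

```python
# Sample sizes reported for the September 1967 and April 1969 testings.
initial_n = {"CCLA": 74, "CCT": 63, "CCB": 117, "ULA": 93}
final_n = {"CCLA": 46, "CCT": 25, "CCB": 56, "ULA": 68}


def attrition_rate(n_start: int, n_end: int) -> float:
    """Proportion of the initial sample that was lost by the follow-up."""
    return (n_start - n_end) / n_start


rates = {group: attrition_rate(initial_n[group], final_n[group]) for group in initial_n}
for group, rate in rates.items():
    print(f"{group}: {100 * rate:.1f}% lost")

# Total follow-up sample across the four groups.
total_final = sum(final_n.values())
```

The lowest loss is in the ULA group and the highest in the CCT group, which is where the reported range of roughly 26% to 60% comes from.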

The differences among the four groups for both 
the initial and final tests on the 19 dependent vari- 
ables were investigated by means of multivariate 
analysis of variance and multiple discriminant 
analysis. In this case, discriminant analysis was
used primarily to characterize group differences
identified in the MANOVA rather than as a means
of classification, which is its more common use
(Bock & Haggard, 1968). In addition, one-way
univariate analyses of variance were calculated for
each of the dependent variables. The results of the
analysis of the initial test data were then compared
with the results obtained in the follow-up
data to determine the effect of the differential
college experience on the relative standing of the
four groups. Changes in each group’s occupational
identification over the 2-year period were examined
by means of rank-order correlations between the
initial and final discrepancy means for the 14
occupations.
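A rank-order correlation of this kind can be sketched as follows. The text does not name the coefficient, so this assumes Spearman's coefficient (the Pearson correlation of the ranks, with tied values given their average rank). The helper names and the five pairs of means are hypothetical and are not taken from Table 1.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from lowest (rank 1) upward; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical initial and final discrepancy means for five occupations.
initial_means = [20.0, 22.5, 25.0, 27.5, 30.0]
final_means = [21.0, 20.5, 26.0, 29.0, 28.0]
rho = spearman(initial_means, final_means)
print(round(rho, 2))
```

A coefficient near 1 would indicate that a group identified with the occupations in much the same order at both testings; lower values indicate a reshuffling of that order.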

It seems necessary at this point to add a note on 
why an analysis of change scores or an analysis of 
covariance was not chosen as the analytical tech- 
nique. The major reason was that neither of these 
techniques could yield direct answers to the major 
question of interest; that is, how do university 
and various community college students differ 
initially and how do they differ after 2 years of 
differential college experience? In addition, meth- 
odological arguments against the use of gain 
scores (Cronbach & Furby, 1970) and the use of 
an analysis of covariance with nonrandomly 
formed groups (Lord, 1967 and 1969) have been 
made elsewhere and need not be repeated. 


RESULTS 


Group comparisons by means of the mul- 
tivariate F statistic provide information 
about the differences among groups on all 
19 stimulus words simultaneously (Bock & 
Haggard, 1968). For the initial data, the 
overall MANOVA F test was significant (F 
= 1.70, df = 57/517, p < .005), indicating 
that the group mean vectors were different. 
Individual pair-wise multivariate F tests 
revealed significant differences between the 
ULA and CCB groups (F = 2.42, df = 
19/173, p < .01) as well as the CCT and 
CCB groups (F = 2.10, df = 19/173, p < 
.01). The overall F test for the final data
was also significant (F = 1.82, df = 57/517,
p < .005). Three of the six pair-wise
group comparisons were significant: ULA
versus CCB (F = 2.71, df = 19/173, p <
.01); CCT versus CCB (F = 2.27, df =
19/173, p < .01); and ULA versus CCT (F
= 1.88, df = 19/173, p < .05). Thus, in
addition to the initial group differences 
being present in the final data, the ULA 
and CCT groups became increasingly dif- 
ferentiated. The nature of these differences 
can be clarified somewhat by looking at 
group differences on each of the stimulus 
words individually and by examining the 
structure of the discriminant functions for 
each set of data. 

The initial and final discrepancy means 
for each of the four groups on the 19 stimu- 
lus words are given in Table 1. Univariate 
analyses of variance were performed for 
each of the 19 scales on both the initial and 
TABLE 1
INITIAL AND FINAL DISCREPANCY MEANS AND F STATISTICS FOR NINETEEN STIMULUS WORDS

                         University      CC liberal arts    CC technical       CC business         F ratios
Stimulus words         Initial  Final    Initial  Final    Initial  Final    Initial  Final    Initial   Final
I Wish I Were           17.56   16.03     22.89   18.72     20.20   17.84     20.57   16.98     3.19*     .57
High Society            27.26   27.24     25.02   23.93     24.40   28.12     23.18   22.32     2.24     3.62*
Outstanding Citizen     20.48   20.87     22.41   19.30     20.12   21.28     22.78   18.88     1.31     1.01
Cultured Person         20.93   18.96     20.61   19.30     21.76   21.20     22.78   19.54      .66      .56
Community Leader        21.40   22.38     21.30   19.96     20.92   22.52     22.18   18.66      .18     2.65*
Teacher                 21.53   20.09     19.70   18.70     21.12   20.04     22.38   20.66     1.03      .54
Doctor                  21.82   22.15     22.48   22.54     21.88   20.32     22.34   21.88      .06      .45
Lawyer                  21.74   20.53     24.04   21.41     20.92   21.36     21.95   20.75      .90     [illegible]
Accountant              24.88   26.65     22.09   22.20     22.20   24.64     21.00   24.05     1.83     1.63
Engineer                20.79   22.59     22.87   20.59     17.76   17.80     23.59   20.20     3.42*    2.15
Technician              21.97   24.70     21.17   20.37     16.44   15.56     21.70   21.43     3.52*    7.22**
Business Executive      23.88   25.15     22.30   21.37     21.76   23.40     20.19   18.46     2.17     6.72**
Clerk                   30.54   32.20     22.93   24.22     23.88   27.60     24.98   28.25     6.67**   5.45**
Salesman                26.20   28.31     20.96   21.43     22.64   24.64     23.39   21.12     3.32*    7.93**
Bookkeeper              28.00   28.93     22.50   23.93     23.88   25.04     22.09   27.28     4.72**   2.51
Electrician/Plumber     27.73   28.47     24.13   23.30     20.80   23.24     25.11   24.89     3.16*    3.24*
Truck Driver            32.54   33.62     26.85   28.43     29.80   28.92     28.57   28.48     3.48*    3.45*
Mechanic/Machinist      26.38   28.10     24.00   23.91     22.12   21.92     25.91   28.46     2.06     4.88**
Policeman/Fireman       26.88   28.17     23.63   22.65   [illegible] [illegible] 24.82   23.48     1.22     4.89**

Note. Underlining indicates that difference between initial and final means is significantly different
from zero (p < .05, t test, two-tailed). Abbreviations: CC = community college.

* p < .05 (F = 2.65, df = 3/191).
** p < .01 (F = 3.88).


final data. The resulting F ratios are listed 
in the last two columns of Table 1. 

The outcome of the ANOVA with regard 
to the self-concept scale (I wish I were) 
showed initial significant differences among 
the four groups. The ULA group had the 
lowest mean discrepancy between the real 
self (I am) and the ideal self (I wish I 
were), while the CCLA group had the highest.
However, the between-group differences
on this scale for the final test were not
significant. Analyses of the within-group
change scores indicated that the CCLA and
CCB groups had changed significantly in
the 2-year period (two-tailed t test, p <
.05).

There were no initial significant group
differences on the four social-role concepts.
TABLE 2
SCALED DISCRIMINANT FUNCTION WEIGHTS FOR BOTH INITIAL AND FINAL TEST DATA

                                      Initial                     Final
Stimulus words                     I            II             I            II
I Wish I Were                  [illegible]  [illegible]       .03          .00
High Society                      -.32          .40          -.23         -.15
Outstanding Citizen                .27          .05           .18         -.27
Cultured Person                   -.07         -.30           .18         -.26
Community Leader                   .33          .08          -.48         -.12
Teacher                            .15          .02           .22          .06
Doctor                            -.41         -.26           .02          .26
Lawyer                            -.10         -.06           .42      [illegible]
Accountant                        -.28         -.25          -.12         -.29
Engineer                           .77          .34           .07          .03
Technician                         .32          .59           .22          .95
Business Executive                -.46         -.15          -.82         -.24
Clerk                              .13          .64           .29      [illegible]
Salesman                          -.08          .09          -.44          .24
Bookkeeper                        -.43         -.27           .55         -.01
Electrician/Plumber               -.06          .41           .18          .05
Truck Driver/Deliveryman          -.34         -.18       [illegible]  [illegible]
Mechanic/Machinist                 .26      [illegible]   [illegible]  [illegible]
Policeman/Fireman                 -.06         -.10          -.06      [illegible]
Canonical correlations             .48      [illegible]   [illegible]  [illegible]
Proportion of total dispersion     54%          28%           50%      [illegible]

However, at the end of the 2-year period
the groups differed significantly on two of
the four concepts (high society and community
leader). The trends that resulted in
these final differences were for the CCLA
and CCB groups to become more closely
identified with the social roles (especially
CCB; three of the four changes were
significantly different from zero), and for the
ULA and CCT groups to remain the same
or increase in discrepancy.


The ANOVAs involving the 14 occupa- 
tional roles revealed significant group dif- 
ferences on 7 occupations for the initial 
data and 8 for the final data. However, the 
occupations involved were not the same and 
the differences revealed an increase in occupational
identification for the CCT and
CCB groups and a decrease for the ULA 
group. 

TABLE 3
GROUP CENTROIDS ON THE FIRST TWO DISCRIMINANT FUNCTIONS FOR BOTH
INITIAL AND FINAL TEST DATA

                             Centroids
Group               IF I      IF II     FF I      FF II
University         -.526      .879     -.537      .427
CC liberal arts     .272     -.168      .042     -.256
CC technical       -.611     -.865     -.390    -1.007
CC business         .688      .064      .791      .142

Note. Abbreviations: CC = community college;
IF = initial function; FF = final function.

Three of the occupations showing significant
differences for the initial data were engineer,
technician, and electrician/plumber,
all of which are in the technical cluster and 
most closely identified with by the CCT 
group. The remaining four occupations in 
the initial data were significant due to high 
mean discrepancy scores for the ULA 
group. For the final test data, differences on 
the scales business executive and salesman
were significant with the CCB group having 
the lowest means. The remaining significant 
differences for the final data were again due 
to high mean discrepancy scores for the 
ULA group. In general, over the 2-year pe- 
riod, the CC groups increased in overall as 
well as specific occupational identity, while 
the ULA group moved in the opposite direc- 
tion. 
A composite picture of the initial and 
final group differences identified above is 
given by discriminant function analysis. 
There are three distinct discriminant func- 
tions defining the three dimensions in which 
the four groups could differ. Statistics are 
available to examine the usefulness of each 
function. For the initial test data, only the
chi-square statistic associated with the first
function was significant beyond p = .05
($\chi_1^2 = 48.48$, df = 21, p < .001; $\chi_2^2 =
26.61$, df = 19, p < .20). Both the first and
second discriminant functions for the final
data had associated chi squares significant
beyond p = .01 ($\chi_1^2 = 48.78$, df = 21, p <
.001; $\chi_2^2 = 36.23$, df = 19, p < .01). In
both cases, the chi square associated with
the third function was very small.
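The article does not say how these chi squares were computed. One common textbook choice is Bartlett's approximation, in which the statistic for the k-th function is [N - 1 - (p + g)/2] ln(1 + eigenvalue), with p + g - 2k degrees of freedom for p variables and g groups. The sketch below assumes that formula, and assumes N = 195 (the sum of the four follow-up sample sizes). Under those assumptions it reproduces the reported degrees of freedom and recovers a canonical correlation for the first initial function that can be compared with the value in Table 2. The function names are illustrative.

```python
import math


def root_df(n_vars: int, n_groups: int, k: int) -> int:
    """Degrees of freedom for the k-th discriminant function (k = 1, 2, ...)."""
    return n_vars + n_groups - 2 * k


def canonical_r_from_chi2(chi2: float, n_subjects: int, n_vars: int, n_groups: int) -> float:
    """Invert chi2 = [N - 1 - (p + g) / 2] * ln(1 + eigenvalue) for the canonical r."""
    factor = n_subjects - 1 - (n_vars + n_groups) / 2
    eigenvalue = math.exp(chi2 / factor) - 1
    return math.sqrt(eigenvalue / (1 + eigenvalue))


N_VARS, N_GROUPS = 19, 4          # 19 discrepancy scores, 4 college groups
N_SUBJECTS = 46 + 25 + 56 + 68    # assumed N: follow-up sample sizes from the Method section

dfs = [root_df(N_VARS, N_GROUPS, k) for k in (1, 2, 3)]
r_first_initial = canonical_r_from_chi2(48.48, N_SUBJECTS, N_VARS, N_GROUPS)
print(dfs, round(r_first_initial, 2))
```

Agreement with the printed values is only a consistency check on the assumed formula, not confirmation that the authors used it.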
The discriminant-function weights for the
first two functions scaled by multiplying 
each of the raw weights by the appropriate 
error standard deviation for each variable 
are given in Table 2. These scaled weights 
indicate by their sizes the relative contribu- 
tion of each variable to discrimination 
among the four groups (Bock & Haggard, 
1968, p. 118). Also given in Table 2 are the 
canonical correlations of the first two func- 
tions with group membership (Cooley & 
Lohnes, 1971) and the proportion of total 
dispersion accounted for by each of the first 
two functions. Note that while the test for 
significance associated with Initial Function 
II did not attain the traditional level of 
significance, the canonical correlation and 
proportion of total dispersion suggest it is of 
substantive interest. 

Centroids for each of the groups for both 
the initial and final data on the first two 
discriminant functions are presented in 
Table 3. These centroids represent the 
group means in the two-dimensional space 
defined by the discriminant functions. The 
centroids are shown graphically in Figure 1. 
We see from Figure 1 that the significant 
differences between the ULA and CCB
groups and the CCT and CCB groups are 
represented by Initial Function I for both 
sets of data. The weights in Table 2 suggest 
that for the initial test, Initial Function I 
represents a somewhat complex business 
versus technical occupations continuum.
However, on the final test, Function I has 
become a clearly business dimension. The 
contribution of community leader to Final 
Function I may be related to the common 
concept of the business executive as fulfilling
a leadership role in the American community.
Thus, Final Function I may be labeled
more accurately as a business dimension
with social connotations.
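Because the centroids in Table 3 are simply points in the plane of the first two discriminant functions, the separation between any two groups can be summarized as a distance between those points. The sketch below uses the Table 3 values and plain Euclidean distance. That metric is an illustrative choice made here, not a statistic reported in the article, and the dictionary and function names are hypothetical.

```python
import math
from itertools import combinations

# Group centroids (Function I, Function II) as listed in Table 3.
centroids = {
    "initial": {
        "ULA": (-0.526, 0.879),
        "CCLA": (0.272, -0.168),
        "CCT": (-0.611, -0.865),
        "CCB": (0.688, 0.064),
    },
    "final": {
        "ULA": (-0.537, 0.427),
        "CCLA": (0.042, -0.256),
        "CCT": (-0.390, -1.007),
        "CCB": (0.791, 0.142),
    },
}


def centroid_distances(points):
    """Euclidean distance between every pair of group centroids."""
    return {
        (a, b): math.dist(points[a], points[b])
        for a, b in combinations(points, 2)
    }


distances = {test: centroid_distances(points) for test, points in centroids.items()}
for test, pairs in distances.items():
    for (a, b), d in pairs.items():
        print(f"{test:7s} {a}-{b}: {d:.2f}")
```

Such distances ignore the unequal importance of the two functions, so they are best read as a rough aid for inspecting Figure 1 rather than as a test of group differences.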

While the ULA and CCT groups differ in
similar ways from the CCB group along one
dimension, Figure 1 shows that these two
groups are distinct in terms of the second
dimension of the discriminant space. The
weights on Initial Function II in Table 2
reflect the fact that the CCT and CCLA
groups identify more closely with the occupations
technician, clerk, and electrician/plumber
than does the ULA group. On the
other hand, the ULA group has a lower
mean discrepancy on the self-concept variable.
Thus, Initial Function II identifies
group differences along a self-esteem dimension
of positive self-concept versus negative
self-concept and identification with low status
occupations. However, the corresponding
final test function is dominated by the
technical scale and is clearly based upon
the occupational identification of the CCT
group. Thus, the second function lost its initial
status connotation over the 2-year period
that was studied.

FIG. 1. Group centroids on the first two discriminant functions (Function I and
Function II) for both the initial and final test data. Separate initial and final
markers are plotted for the University, CC Technical, CC Business, and
CC Liberal Arts groups.

To compare further the occupational
identification of the four groups, the mean
discrepancy scores for each group on the 14
occupations were ranked from lowest
(Rank 1) to highest (Rank 14) for both the
initial and final tests. Intergroup correlations
as well as correlations with the social
status ranking of the occupations (Reiss,
1961) are given in Table 4. Initially, the 
rankings for the ULA and CCT groups are 
most similar (.80), while the CCLA group 
is least like any of the others. The low cor- 
relations of the CCT and CCLA groups 
with the status list (.54 and .43) are consist- 
ent with the above status interpretation of 
Initial Function II in Table 2. The inter- 
group rank-order correlations for the final 
data are much higher than the initial ones; 
note especially the ULA-CCLA groups’ cor- 
relations. The initial value was .49 and the 
TABLE 4
INTERGROUP AND STATUS RANK-ORDER CORRELATIONS OF THE MEAN
DISCREPANCY SCORES FOR THE FOURTEEN OCCUPATIONS

Group

1. University
2. CC liberal arts
3. CC technical
4. CC business
5. Status ranking*

Note. Numbers on the diagonal are correlations
between the initial and final mean ranking
for each group. Numbers above the diagonal are
the intergroup correlations for initial testing.
Numbers below the diagonal are the intergroup
correlations for final testing. Abbreviations: CC
= community college.

* Status ranking of 14 occupations by Reiss
(1961).

r_s = .46 for p < .05.
r_s = .65 for p < .01.


final value was .85. In addition, the initial 
correlation for CCLA with the status list 
was .43, while for the final test the value 
was .66. 

Finally, the stability of each group's 
mean ranking over the 2-year period was 
determined by correlating the initial and 
final ranks. The results are listed on the 
diagonal of the matrix in Table 4. The 
greatest amount of change occurred in the 
CCB (.50) and the CCLA groups (.66), 
while the ULA (.89) and CCT (.88) groups 
were relatively stable in terms of those oc- 
cupations with which the groups identified 
most closely. 


DISCUSSION 


While there are initial status differences 
among community college and university 
students, the results suggest that commu- 
nity college students do not suffer any detri- 
mental effects due to a lack of prestige. The 
feelings of second-class citizenship often at- 
tributed to such students did not appear to 
exist. Indeed, relative to their university
counterparts, their self-esteem increased and
their occupational identification sharpened.

Specifically, the following outcomes seem
to occur as the result of the 2-year college
experience. First, a noticeable increase
was found in self-esteem for students in the
community college, resulting in comparable
levels of self-esteem for these students as
compared to those in the university (this
enhancement in self-esteem is most striking
for liberal arts students in the community
college). Second, there was an increase in
the status level of occupations identified
with by the community college liberal arts
students as compared to the other groups.
Third, there was an increase in the occupational
focus and identity of technical
and business students in the community
college as compared to university and community
college liberal arts students; specifically,
technical students identify more with
technical occupations and less with business
ones, while the reverse is true for business
students.

Thus, the 2-year college experience may 
have a dramatic effect. This effect may take 
two forms. First, it may lead to a heighten- 
ing of self-esteem, presumably based on the 
kind of opportunity engendered by the 2- 
year college movement, that is, making col- 
lege accessible to a wider range of students. 
Second, it may lead to an intensification of 
appropriate occupational identification 
among students enrolled in occupationally 
oriented programs (thus, playing a role in 
career development as described by Super). 
The first 2 years of the university experi- 
ence, on the other hand, produced no notice- 
able shift in self-concept in terms of either 
self-esteem or occupational identification. 

While these causal assertions are only
tentative, the fact that the changes observed
over the 2-year period studied were
in the predicted direction and that the results
were consistent with Super’s theory of
vocational development makes them quite
plausible. The increase in differentiation
among groups along occupational lines and
the elimination of self-esteem differences at
the end of the 2-year period studied were
consonant with the theoretical expectations.


The results also lend support to those
who advocate the community college as a
viable alternative to the university for 
many students. This may be especially true 
for those students who lack the self-confi- 
dence necessary to succeed in the competi- 
tive university environment. It appears that 
the community college provides the oppor- 
tunity for success that is essential to devel- 
oping self-esteem and realistic occupational 
identification. 
THE MINICOURSE AS A VEHICLE FOR CHANGING TEACHER BEHAVIOR:
A THREE-YEAR FOLLOW-UP


WALTER R. BORG* 


Far West Laboratory for Educational Research and Development, 
Berkeley, California 


A 3-year follow-up of teachers trained in Minicourse 1 was reported.
Videotape recordings of each of 24 experimental teachers were made
before, immediately after, 4 months after, and 39 months after training.
These recordings were analyzed to compare the level of teacher
performance on each specific skill covered in Minicourse 1 at each of
four checkpoints. An analysis of variance revealed that on the postcourse
evaluation the subjects were significantly above their precourse
level on all 10 behaviors. Comparisons between performance before the
course and on the 4-month follow-up revealed significant differences on
nine behaviors. After 39 months, the performance of the subjects
was still significantly superior to their precourse performance on 8
of the 10 behaviors that were scored.


Since March of 1967, the principal task
of the Teacher Education Program, Far
West Laboratory for Educational Research
and Development, has been the development
and evaluation of minicourses. The
minicourse is a self-contained, self-instructional
package of teacher training materials
designed to help the teacher master specific
teaching skills and strategies. Minicourses
are an extension of the research on microteaching
and of the technical skills of
teaching that was initiated at Stanford
University in 1963.


Previous Research


Microteaching is a process in which the
trainee practices specific teaching skills by
teaching a short lesson to a small group of
pupils. The lesson is usually recorded on
videotape and evaluated by the trainee. Research
at Stanford (Allen & Fortune, 1966)
found that interns spending 10 hours per
week on microteaching obtained significantly
higher ratings on teacher effectiveness
than did a control group that devoted
25 hours per week to regular instruction and
teacher aide experiences. A similar study by
Kallenbach and Gall (1969) found that interns
trained with microteaching were rated
as equal in effectiveness to interns who took
regular student teaching, although only
one-fifth as much time was devoted to the
microteaching treatment.


* Requests for reprints should be sent to Walter
Borg, who is now at the Department of Psychology,
Utah State University, Logan, Utah 84321.

This article reports a follow-up designed
to measure the behavior of participating
teachers 3 years after they had completed
Minicourse 1. Minicourse 1 was designed to
help teachers develop nine skills and extinguish
three undesirable behaviors, all related
to the use of the discussion method of
teaching. Some measures of pupil performance
were also obtained. The initial evaluation
of Minicourse 1, carried out shortly
after the 48 participating teachers had completed
the course in November 1967, indicated
that of the 13 specific teacher and
pupil behaviors measured before and after
training, 11 showed large and statistically
significant improvement between pre- and
postcourse measures of classroom performance
(Borg, 1969). A further check of the
classroom performance of 38 of these teachers
4 months after completing the course
showed that teachers had continued to improve
in 3 of 11 skills measured, and had
not regressed significantly on any skill
(Borg, Kelley, Langer, & Gall, 1970).

A recent replication of our initial evaluation
by Mack and Rector involved 52 inservice
teachers in western New York
(Borg, 1971). The investigators obtained
results that closely supported the earlier
findings although they used the commercial 
rather than the field-test version of the 
course. A small-scale evaluation of Mini- 
course 1 reported by Foster (1969) also 
found large differences between the per- 
formance of teachers who took Minicourse 1 
and those who served as a control group. 

In addition to the research on Minicourse
1, evaluations of four other minicourses
have generally found large performance 
changes for teachers completing each course 
(Borg, et al., 1970). The results would sug- 
gest that the minicourse instructional model 
is generally effective as a tool for helping
teachers to develop specific skills and to 
change classroom behavior patterns. 


METHOD 


The Minicourse Instructional Model 


In taking the minicourse, the trainee first views 
an instructional film in which one to three specific 
teaching skills are described and illustrated with 
examples from various classrooms. This instruc- 
tional film is followed by a model film in which 
the trainee sees a model teacher fitting these skills 
into a regular classroom lesson. As part of viewing 
the model film, the trainee is usually called upon 
to identify each skill as the model teacher uses it. 

After viewing the instructional and model films,
the trainee receives further information on the
specific skills by studying a teacher handbook. He
then prepares a microteach lesson designed to give
him practice in using the skills. Microteach lessons,
as used in Minicourse 1, are typically 10-20
minutes in length and are taught to a group of
from 4 to 10 pupils taken from the in-service teacher's
regular classroom. The lesson is recorded on
videotape. Immediately after the lesson, the trainee
replays the recording and evaluates his use of the
teaching skills employing evaluation forms provided
in the teacher handbook. Then, based on his
evaluation, he replans the lesson and reteaches it
to another small group of pupils from his regular 
classroom, again recording the lesson on videotape 
and evaluating it using additional evaluation forms 
provided in the handbook. Thus, the minicourse in- 
structional model contains three main elements: 
(a) the instructional and model films, (b) the mi- 
croteach and reteach lessons, and (c) the videotape 
replay and self-evaluation. 


The Problem 


The problem of this research was to determine 
the level of behavioral change persisting among 
teachers approximately 3 years after they had completed
Minicourse 1. Research on this course has
indicated that the significant changes found immediately
after teachers completed the course persisted
with virtually no loss when the same teachers
were again evaluated 4 months later. Until this
study, however, no research had been conducted 
to determine the long-term effects of minicourse 
training on the performance of teachers. 


Procedure 


The prototype version of Minicourse 1 was field 
tested with a small sample of in-service teachers 
during July and August of 1967. At the completion 
of this field test, the course was revised extensively 
and underwent its second field test in October and 
November of 1967. A total of 48 teachers were in- 
volved in this field test. One week before the start 
of the field test, each teacher was given instruc- 
tions for preparing a 20-minute class-discussion 
lesson. The teacher was advised to select content 
for the lesson out of his current class work. How- 
ever, the characteristics of an effective discussion 
lesson were briefly described and the teacher was 
urged to prepare as effective a lesson of this type 
as he could. He was also advised that representa- 
tives from the Far West Laboratory would visit 
his classroom at a specified date and hour in order 
to make a videotape recording of his class while he
was teaching this lesson. At the appointed time, 
these recordings were made and formed the basis 
for the precourse evaluation of the teacher's per- 
formance. The course itself required approximately 
75 minutes a day to carry out the training. Near 
the end of the course, the teachers were again given 
instructions to prepare a class discussion lesson 
that was to be recorded during the week after the 
course was completed. These instructions were 
identical to those given prior to the course. Again, 
a 20-minute videotape recording of the teacher 
was made while he conducted the class discussion 
lesson that he had planned.

In order to estimate the short-term retention of 
the course by the field-test teachers, a third video- 
tape was collected approximately 4 months after 
the teachers completed the course. Again, these 
videotapes were made under essentially identical
conditions to those that maintained during the pre- 
course and immediately postcourse videotaping. 
TABLE 1
INTERRATER RELIABILITY COEFFICIENTS FOR MINICOURSE 1 SKILLS

Minicourse skill | Variable measured | Reliability*
1. Redirecting the same question to several pupils | Number of teacher redirections | .99
2. Framing questions that call for longer pupil responses | |
   a. Asking for sets or groups of related facts when formulating information level questions | Number of words in pupil response | .99
   b. Avoiding Yes or No replies | Number of Yes, No, or one-word replies | .99
3. Framing questions that require the pupil to use higher cognitive processes | Proportion of higher order vs. fact questions | .98
4. Prompting | Number of times teacher uses prompting | .98
5. Seeking further clarification and pupil insight | Number of times teacher asks for further clarification | .98
6. Undesirable behaviors to be reduced or eliminated | |
   a. Teacher should not repeat his own questions | Number of times teacher repeats own questions | .95
   b. Teacher should not answer his own questions | Number of times teacher answers own questions | .98
   c. Teacher should not repeat pupil answers | Number of times teacher repeats pupil answers | .96

* Reliability of composite score of two raters, corrected using Spearman-Brown formula.
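
The Spearman-Brown correction named in the table note can be sketched as follows. This is only an illustration of the standard formula; the function names and the sample correlation of .90 are assumptions, since the article reports only the corrected coefficients.

```python
def spearman_brown(r_single: float, k: int = 2) -> float:
    """Reliability of a composite of k parallel raters.

    r_single is the correlation between two single raters; k = 2
    corresponds to the two-rater composite described in the table note.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


def single_rater_r(r_composite: float, k: int = 2) -> float:
    """Invert the formula: single-rater correlation implied by a composite."""
    return r_composite / (k - (k - 1) * r_composite)


# Hypothetical value, not taken from the article.
corrected = spearman_brown(0.90)
implied = single_rater_r(0.99)
```

Under this formula a two-rater composite of .99 would correspond to a single-rater correlation of roughly .98, so the correction changes little when agreement is already this high.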


Thus, by May of 1968, the investigator had col- 
lected three 


determine the 
skills prior to 


were still teaching in the field-test schools and of 
24 agreed to plan a discussion lesson 
be videotaped in their classrooms.
The same instructions that had been used
for the pre- and postcourse evaluations were given 
to teachers in preparation for the 3-year follow-up. 
The 20-minute class discussions were recorded on 
videotape for each of the follow-up teachers dur- 
ing February of 1971, approximately 39 months 
after these teachers had completed the course. 
The 3-year follow-up videotapes were
scored using the same scoring instructions
and criteria that had been employed in
scoring the earlier videotapes. Two raters
independently scored transcripts of each videotape.
The interrater reliabilities for the
scores on the Minicourse 1 skills are reported
in Table 1 along with a brief definition
of each skill. These reliability coefficients
compare favorably with those obtained
in scoring the earlier tapes, which
ranged from .60 to .98 for the different
skills. Performance of the 24 teachers included
in the 3-year follow-up is summarized
in Table 2.²

One of the 12 skills, calling on both volunteers
and nonvolunteers, could not be
scored because of technical problems; that
is, it was often not possible to determine
from the videotape recording whether or not
a given pupil had volunteered.

² Four of these teachers were not evaluated at
the 4-month follow-up. Therefore, data on the 4-month
follow-up are based on 20 cases while data
on the pretape, posttape, and 3-year follow-up are
based on 24 cases.

Three other skills covered in Minicourse
1 (refocusing, frequency of punitive teacher
responses to incorrect pupil answers, and
pausing) had not changed between the precourse
and postcourse evaluations and therefore
were not scored, since it seemed very
unlikely they would show significant results
on the 3-year follow-up.

This left 8 of the original 12 skills that 
were analyzed in the 3-year follow-up. 
Analysis of variance was employed to com- 
pare the performance of the subjects on the 
videotapes made of their teaching perform- 
ance before minicourse training, shortly 
after training, 4 months after training, and 
39 months after training. The first skill reported
(redirection) changed markedly between
pre- and postevaluations and then remained
stable over the two follow-up periods.
Redirection is the technique of framing
a question having several possible answers
and then directing it to several pupils
rather than to a single pupil. The mean
number of redirections made by teachers in 
the 20-minute precourse lesson was 23.75. 
On the immediate postcourse lesson, these 
teachers used redirection an average of 
34.60 times, an increase of about 37% in the 
use of this specific technique. Teacher use
of this skill showed a further small gain on 
the 4-month follow-up and then persisted 
with virtually no loss over the subsequent 3 
years. 

Probing describes a class of techniques 
designed to lead the pupil to a more ade- 
quate or complete response. Minicourse 1 
attempts to increase the teacher’s use of 
three probing techniques. These are 
prompting, in which the teacher gives the
pupil clues or asks him leading questions;
further clarification, in which the teacher
attempts to get the pupil to clarify, elaborate,
or explain his initial response; and refocusing,
in which the teacher attempts to
get the pupil to relate his initial response to
other topics that the class has studied. 

Refocusing behavior was virtually nonexistent
in either the precourse or postcourse
tapes. In most discussion lessons, opportunities
to use refocusing seem to be limited.
The failure of the course to develop this 
skill may indicate that the minicourse 
model is not useful in shaping teacher be- 
havior that can be practiced only infre- 
quently in the microteach and reteach les- 
sons. The reader will recall that this behav- 
ior was not scored in the follow-up studies. 

Teacher use of prompting showed a con- 
siderable gain between pre- and postcourse 
measures, but most of the gain was lost by 
the time the 4-month follow-up was con- 
ducted. Teacher use of clarification was 
somewhat more permanent. Although some 
regression occurred between the postcourse 
evaluation and the follow-up studies, the 
mean score after 3 years was still found to 
be significantly higher than the mean pre- 
course performance. 

Let us now consider changes in three neg- 
ative teacher behaviors that the course at- 
tempts to reduce or eliminate. These behav- 
iors are repeating the question, repeating 
the pupil’s answer, and answering one’s own 
questions. Repeating the question is gener- 
ally considered a poor practice since it 
wastes discussion time and encourages pupil 
inattention. Repeating pupil answers is con- 
sidered desirable by some teacher educators 
since it provides a degree of reinforcement 
to the pupil. We consider it undesirable be- 
cause it increases teacher talk and also con- 
ditions pupils to listen to the teacher rather 
than to one another since they can expect 
the pupil’s answer to be repeated by the 
teacher. 

The disadvantages of the teacher answer- 
ing his or her own questions are obvious. If 
carried to an extreme, this behavior results 
in the teacher giving a monologue rather 
than conducting a discussion. The reader 
will note in Table 2 that all three of these 
negative practices were drastically reduced 
after teachers had completed the course. 

It will also be noted that these reductions
held up remarkably well over the 3 years 
following training. These results suggest 
that the minicourse instructional model 
may be particularly effective in helping 
teachers reduce their use of undesirable 
teaching behaviors.

Another objective of the course is to train
teachers to ask questions that call for
longer pupil responses and avoid questions 
that can be answered by a single word. The 
average word count for pupil responses 
nearly doubled on the postcourse tapes. Al- 
though some regression occurred over the 
two follow-up measures, pupil responses on 
the 3-year follow-up were still significantly
higher than those elicited by teachers before 
they had taken Minicourse 1. 

The frequency of one-word pupil respon- 
ses, on the other hand, showed a significant 
increase between the 4-month and the 3- 
year follow-up evaluations. This is the only 
variable measured in which the precourse 
mean was significantly more favorable than 
the mean on the 3-year follow-up. It 
seemed possible that grade level could be a 
factor in the number of one-word pupil re- 
sponses. However, a check of the participat- 
ing teachers indicated that 15 were still 
teaching the same grades they taught at the 
time of the initial study. The mean grade 
level for the 24 teachers was virtually unchanged
(4.95 vs. 4.84) over the 3-year period.
One-word responses could also be a
function of the ability level of pupils involved.
However, since pupil ability data
were not collected, the safest conclusion is 
that the course failed to bring about long- 
term improvement in the teacher's skill in 
eliciting fewer one-word responses from his 
pupils. 

Finally, the course attempts to increase
the teacher's use of higher cognitive questions.
Research (e.g., Floyd, 1960) has demonstrated
that many teacher questions require
little of the pupil except the recall of
isolated facts. Our analysis indicated that
only 38 percent of the teachers’ precourse
questions called for higher cognitive processes.
On the postcourse tapes, higher cognitive
questions increased to 50%.

This percentage remained virtually unchanged
when measured in the two follow-up
evaluations. This reflects a remarkable
stability in the behavior of the average
teacher who received the minicourse training.

One major objective of the course that 
related to several of the specific skills
taught was to reduce the percentage of time
during class discussion when the teacher is
talking. Previous studies have shown that
teachers talk as much as 70% of the time
during class discussions, thereby severely
restricting the amount of time available for
pupil contributions (Floyd, 1960; Adams, 
1964). Analysis of Minicourse 1 data re- 
vealed that the average teacher talked 53% 
of the time before the course and only 33% 
after the course. Reducing teacher talk to 
this degree resulted in a profound change in 
the discussion atmosphere on the postcourse 
videotapes. Pupils were generally more in- 
terested and more willing to participate; di- 
rect interactions between pupils were more 
in evidence; and teachers no longer domi- 
nated and restricted the discussion. 

This change persisted virtually undimin- 
ished to the point of the first follow-up. 
However, after 3 years the average teacher 
had regressed significantly, and the propor- 
tion of teacher talk had increased to 45%. 
Apparently, the tendency for teachers to 
talk when they should be listening is a very 
powerful one. 

Limitations 

As is the case for most field studies, it 
was necessary for the investigator to make 
several compromises between cost and rigor 
in this project. The cost of sending a techni- 
cian into the field to make classroom video- 
tapes is such that we were unable to obtain 
follow-up tapes on teachers who were no 
longer teaching in one of the original 12 
field-test schools. A further limitation was 
in the makeup of the initial sample. The 
original 48 field-test teachers were a volun- 
teer group and were not a random sample of 
any teacher population. Thus, the findings 
probably best reflect the level of perform- 
ance change that can be expected when the 
minicourse is used with volunteer teachers. 

In the original Minicourse 1 evaluation 
we decided against a control group because 
we considered the probability of experi- 
enced teachers changing their teaching be- 
havior without intervention was so slight 
that the cost of obtaining precourse and 
postcourse measures on control teachers was
not justified. Since that time we have employed
control groups in evaluation of Minicourses
5, 9, and 15, and none of these control
groups have gained significantly between
the pre- and postcourse evaluations.

It would also have been desirable to obtain
samples of each teacher's performance
randomly, and at times unknown to the 
teacher. Several problems made this ap- 
proach impractical. We did not have 
enough equipment to leave a videotape re- 
cording system set up in each classroom that 
could have been activated on a random time- 
sampling basis. Since very few classrooms 
contain one-way mirrors or other facilities 
that allow recording the teachers’ behavior 
without their knowledge, random time-sam- 
ple recordings simply were not practical. 

Another problem with random time sam- 
pling is that a much larger sample of 
teacher behavior must be collected in order 
to obtain behavior samples relevant to the 
specific skills covered in a given minicourse. 
In the case of Minicourse 1, for example, 
only class discussion lessons are relevant. 
Since the cost of analyzing videotapes of 
teacher performance is high, one must raise 
the question of whether evaluating 2 hours 
of tape obtained via random time sampling 
is justified as compared with analyzing 20 
minutes of a class discussion lesson in which 
the teacher knows his performance is being 
recorded. 

Since the teacher knew on all four evalu- 
ations that his behavior was being recorded, 
we cannot generalize our findings to situa- 
tions in which the teacher does not know a 
recording is being made. It is possible that 
the teacher has been conditioned to perform 
when the camera is present and does not use 
the minicourse skills when no recording is 
being made. However, although we do not 
yet have research evidence on this question, 
much of our experience in the development 
of minicourses suggests that our recordings 
probably provide a reasonable estimate of 
the teacher’s regular classroom performance.
For example, we have learned in developing
our model lessons that it is extremely
difficult to get a teacher regularly
to emit specific behaviors that are not part 
of his teaching practice. Even though our 
model lessons are typically designed to pro- 
vide examples of only two or three very 
simple skills, we find we must work with
teachers many hours before these skills appear.
Our usual procedure includes giving
the teacher a precise operational definition
and examples of each skill, developing with
the teacher a detailed lesson plan that provides
points at which each skill is to be
used, and conducting several rehearsals in 
which the teacher sees a videotape replay 
and gets prompt feedback on his perform- 
ance. When we consider that this elaborate 
process is needed to get our model teacher 
to emit 2 or 3 simple behaviors, it seems 
doubtful if many teachers can emit before 
the camera the 12 or so behaviors covered 
in a typical minicourse unless these behaviors
have become part of the teacher's regular
teaching repertoire.

Some of the results of the 3-year follow-up
also seem relevant to this question. At
the time of the final recording of teacher 
performance, the participating teachers had 
not been recorded for about 35 months. Yet, 
the teachers were still significantly superior
to their precourse performance on seven of 
the eight teacher behaviors measured. Fur- 
thermore, on five of these variables the av- 
erage teacher performance was virtually 
identical to that recorded 35 months earlier 
at the 4-month follow-up. Although possi- 
ble, it seems very unlikely that a teacher 
could fail to practice these skills for 3 
years, and then emit them at the immediate 
postcourse level when a recording is made. 
In the final analysis, all of the limitations 
we have discussed are concerned with 
whether added control would have been justified
in view of the substantially higher
cost involved. It was our judgement that 
incorporating these controls and safeguards 
would not have changed the overall out- 
comes of the research since very large 
changes in teacher performance were found
for most of the Minicourse 1 skills. 
DIRECTION OF THE EFFECT OF QUESTIONS IN
PROSE MATERIAL¹


BARRY McGAW AND ARDEN GROTELUESCHEN²
University of Illinois 


The direction of the facilitative effect of questions inserted at intervals
in prose material was examined in terms of the textual distance of
particular information in the passage from the inserted questions and
the relationship between the information tested by the inserted questions
and that tested by the criterion test items. The subjects consisted
of 140 undergraduate teacher education students obtained as paid volunteers.
Results showed that the initial effect of inserted questions
may be forward, shaping appropriate inspection behaviors. In addition,
superior performance on pages immediately after questions suggested
a forward effect mediated through increased attentiveness. Superior
performance on criterion items dealing with the same section of text as
the inserted questions, but constructed so as to exclude the possibility
of direct transfer, suggested a facilitative review effect, the facilitation
resulting from the memory search initiated by the inserted questions.


The insertion of questions in prose mate- 
rial has been shown to facilitate learning 
from the material (Rothkopf, 1966). The 
effects of these questions are both direct, 
facilitating subsequent performance on 
identical questions, and indirect, facilitat- 
ing performance on other questions about 
the text materials. The effect is marked 
when the questions refer to preceding mate- 
rial in the text (postquestions) but may 
even be reversed when they refer to subse- 
quent material (prequestions) (Frase, 1968; 
Frase, Patrick, & Schumer, 1970; Rothkopf,
1966). 

Although both direct and indirect effects 
have been reported, the operation of these 
effects has not been clearly explicated. One 
conjecture is that the questions serve to 
shape inspection behaviors thus facilitating 
performance on posttest items dealing with 


¹ The research reported herein was performed
pursuant to a grant with the Office of Education,
United States Department of Health, Education,
and Welfare. The authors are grateful to E. Z.
Rothkopf and R. C. Anderson for helpful comments
on an earlier draft of this paper.

² Requests for reprints should be sent to Arden
Grotelueschen, College of Education, University of
Illinois, Urbana, Illinois 61801.
material following the questions in the text
(Rothkopf, 1963). That is, the effect of the
questions is forward in that they influence
the inspection of materials that have not
yet been read. From this point of view
inspection behaviors are seen to be reinforced
(and, thus, maintained) if the inserted
questions can be answered or nonreinforced
(and, thus, altered) if the inserted
questions cannot be answered. A
number of studies have strengthened this
shaping hypothesis. Rothkopf and Bisbicos
(1967) observed that the facilitation was
greater toward the end of the text and that
it was selective, being greatest for items in
the criterion test that were similar to those
in the original text. Rothkopf and Coke
(1968) found that learning was an increasing
function of the likelihood that fragments
of the text were noticed. Frase
(1969), in demonstrating the effects of different
organizations of prose material,
showed that learning was determined by the
aspects of the text to which the learner
could attend.

Such a forward effect need not operate
only through shaping appropriate inspection
behaviors, causing subjects to attend
to appropriate features of the text. It may
also function by simply controlling the level
of attentiveness, causing the reader to at- 
tend more carefully to the material follow- 
ing each set of inserted questions. The effect 
of the questions would be cyclic, with the 
effect diminishing as the reader moves 
through the material following the ques- 
tions but being reinstated following the next 
set of questions. An explanation only in 
terms of shaping appropriate inspection be- 
haviors would predict a cumulative im- 
provement rather than a cyclic effect. Al- 
though there are no clear data to support 
this second hypothesis, there is some 
suggestive evidence. Rothkopf and Bloom 
(1970), for example, found that reading 
rate slowed after each set of inserted questions.

The two hypotheses discussed above pos- 
tulate a forward facilitative effect for post- 
questions. However, the experimental re- 
sults to date have not ruled out the possibil- 
ity of the indirect facilitative effect of ques- 
tions being a backward, or review, effect. 
Frase (1968) noted that the superiority of 
postquestions over prequestions occurred
even on the first paragraph (though this 
could be attributed to a suppressive effect 
of prequestions). Watts and Anderson 
(1971) found no increase in performance 
toward the end of their material. Bruning 
(1968) demonstrated that there is an addi- 
tive review component in the effect of post- 
questions. 

If the facilitative effect of inserted questions
is, at least in part, a review effect,
then the facilitation could be expected to be
greater for criterion test items that deal
with material related to that reviewed in
answering the inserted questions. Rothkopf
and Bisbicos’ (1967) observation of greater
facilitation on criterion test items similar to
questions inserted in the text could be accounted
for in this manner. Further, such a
review effect could be expected to be
stronger with short preceding lags, that is,
with material read shortly before the inserted
questions, and weaker with longer
lags.

If the facilitative effect of inserted questions
is, at least in part, a respondent phenomenon,
with the effect of the questions
increasing the attentiveness to material following
the questions, then the effect should
be greatest with short following lags, that
is, on material immediately following the
questions, and weaker with longer lags.

The purpose of the present study was to 
test for both forward and backward effects 
of inserted questions, and to determine the 
conditions under which each is most effec- 
tive. The conditions examined were the re- 
lationship between the inserted questions 
and the subsequent questions on which fa- 
cilitation was revealed and the textual dis- 
tance of the material tested from the in- 
serted questions. 
Materials


The basic material used in the reported experi- 
ments was a selection of material from Rachel Car- 
son's book, The Sea Around Us. The text was 
multilithed onto 21 pages of approximately 260 
words each. From each page of the material three 
questions were developed. All 63 questions were of 
the completion type and required the recall of spe- 
cific information from the text. The questions were 
prepared so that, for each page, two of the ques- 
tions dealt with the same material, while the third 
dealt with an unrelated topic. 

The pairs of questions dealing with the same 
material were developed in such a way that, al- 
though they dealt with the same text material, 
neither could be answered from a knowledge of 
the answer to the other. For example, from the 
following passage in the text a pair of questions 
were formed: 


Then from the surveying ship Bulldog, examining
a proposed northern route for a cable from
Faroe to Labrador in 1860, came another report.
The Bulldog’s sounding line, which at one place
had been allowed to lie for some time on the
bottom at a depth of 1260 fathoms, came up with
13 starfish clinging to it.

The two questions developed were:

(a) The surveying ship which recovered starfish
    from a depth of 1260 fathoms in 1860, was
    exploring a route for a cable from Faroe to
    ______.

(b) The surveying ship ______, which recovered
    starfish from a depth of 1260 fathoms in
    1860, was exploring a route for a cable from
    Faroe.

The third question developed from the same page
as this pair required the name of an Arctic explorer.
Thus 42 of the 63 questions were matched in
pairs. Of these 42, 21 (1 chosen at random from 
each pair) were selected for insertion in the text 
material for the various experimental treatments. 
These inserted questions are referred to as EQs. 
The remaining 42 items, 21 matched and 21 un- 
matched with the EQs, were used to form a cri- 
terion test (T1). The 21 EQs were also used in a 
criterion test (T2) given after T1. 

Test T1 was intended to measure the general 
facilitative effect of EQs whereas test T2 was to 
measure specific learning of the material tested by 
EQs. The two tests were bound into a single test 
booklet. 
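
The allocation of the 63 questions to inserted questions and criterion tests can be sketched as below. This is a minimal illustration under stated assumptions: the item labels ("a", "b", "u"), the function name, and the seeded random generator are hypothetical, and only the counts and the one-per-pair random selection come from the text.

```python
import random


def allocate_items(n_pages: int = 21, seed: int = 0) -> dict:
    """Split three questions per page into EQ, T1, and T2 item sets.

    Each page contributes a matched pair ("a", "b") and one unrelated
    item ("u"); these labels are hypothetical placeholders.
    """
    rng = random.Random(seed)
    eqs, t1_matched, t1_unmatched = [], [], []
    for page in range(1, n_pages + 1):
        pair = [(page, "a"), (page, "b")]
        chosen = rng.choice(pair)  # one item of each pair becomes an EQ
        other = pair[1] if chosen == pair[0] else pair[0]
        eqs.append(chosen)
        t1_matched.append(other)        # matched with the EQ, not identical
        t1_unmatched.append((page, "u"))  # unrelated topic from the same page
    return {
        "EQ": eqs,                         # inserted in the text
        "T1": t1_matched + t1_unmatched,   # general facilitation test
        "T2": list(eqs),                   # repeats the EQs after T1
    }


items = allocate_items()
```

Keeping T1 free of the inserted items is what lets it measure indirect facilitation, while T2 repeats the EQs to measure direct learning.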

A general measure of reading ability was ob- 
tained for all subjects by administration of Part II 
(Reading) of the Reading Comprehension Test 
(Form 1A) from the ETS Cooperative English 
Tests. 


Treatments 


Three experimental treatments and two control 
treatments were used. All text materials were 
bound into booklets which were distributed in ran- 
dom order, and all subjects took part in the ex- 
periments at the same time. Because subjects 
could see one another they were told explicitly, in 
the printed instructions at the front of the booklet, 
that although all booklets contained the same 
passages, some were arranged differently from 
others and that they should not be concerned if 
other subjects appeared to be involved in writing, 
for example, when they were not. The instructions 
to all subjects were to “study each page of the 
chapter carefully, paying close attention to facts 
and figures and to names and dates.” 

The treatments, summarized in Figure 1, were 
as follows: 

Experimental Group 1 (E1): After the first six
pages, that is, after Sections A and B at Position
ab, a sheet was bound into the text booklet with
the six EQs from those pages. Similar sheets, each
with three EQs from the preceding three pages,
were bound into the text booklet after the ninth,
twelfth, and fifteenth pages, that is, at Positions
c, d, and e, respectively. Thus, inserted questions
always referred to preceding material.

Experimental Group 2 (E2): Questions were
inserted in the text at Positions ab, d, and f in a
manner similar to that for E1.

Experimental Group 3 (E3): Questions were inserted
in the text at Positions c, e, and g.

Control Group (C): No insertions were included 
in the text booklets. After the instructions, sub- 
jects simply read the 21 pages of text. 

Control Group with Questions (CQ): At the
same points in the text as for the E1 group, a cor-
responding number of irrelevant questions (CQs)
from a Personal Opinion Scale, adapted from a
dogmatism scale (Rokeach, 1960), were introduced.
These questions required approximately the same
time to complete as the EQ items. The purpose of
this treatment was to determine the effect of pro- 
viding break points in the reading without text- 
related questions. 


Subjects 


Subjects for these experiments consisted of 140
undergraduates, obtained as paid volunteers, at
Concordia Teachers College, River Forest, Illinois.
Twenty-eight subjects were assigned at random to
each of the treatment groups.


Procedure 


All subjects attended a single group session. 
The reading comprehension test, requiring 25 min- 
utes, was administered first. The experimental 
booklets were then distributed to subjects who 
worked through them at their own pace. When a 
subject completed his booklet, he indicated this 
to a monitor who removed it and provided him 
with a test booklet containing both Tests T1 and
T2. Subjects were allowed to leave the room when 
they had completed both tests. The entire proce- 
dure required about 90 minutes for the slower sub- 
jects. 


FIG. 1. Text booklet sequence for different treatment groups in Experiments A and B.
DIRECTION OF QUESTIONS IN PROSE MATERIAL 


RESULTS AND DISCUSSION 


The treatments used in this study form 
two separate experiments. Groups E1, C
and CQ comprise one experiment (Experi- 
ment A) and Groups E2 and E3 the other 
(Experiment B). The results for these are 
presented separately. 


Experiment A 


Facilitative effect of inserted questions. 
Mean scores, raw and adjusted, on the two 
criterion tests, T1 and T2, are shown in 
Table 1. The adjustment resulted from the 
use of reading comprehension test score as a 
covariate. (Test T2 actually contained 21 
items but, for this analysis, only those 15 
inserted in the text for Group E1 were
used.) For Test T1, the differences among
the adjusted means were significant (F =
3.53, df = 2/80, p < .04).

The difference between the two control 
groups was not significant. Thus, there was 
no support for the hypothesis that the facil- 
itative effect of inserted questions was due 
to increased attentiveness following a rest 
from reading. That is, the effect apparently 
occurs only with the insertion of text-re- 
lated questions such as the EQs. 

On Test T2, the performance of Group
E1 was similarly superior to that of the
control groups (F = 25.86, df = 2/80, p <
.001). Again the difference between the con-
trol groups was not significant. Control
Group CQ was, therefore, dropped for sub-
sequent analyses.
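
The reported degrees of freedom can be checked against the design. The following is a minimal sketch, assuming a one-way analysis of covariance with a single covariate (the reading comprehension score) and 28 subjects per group as stated under Subjects; the function name is illustrative and not taken from the original report.

```python
def ancova_error_df(n_subjects: int, n_groups: int, n_covariates: int = 1) -> int:
    """Error degrees of freedom for a one-way ANCOVA (assumed model)."""
    return n_subjects - n_groups - n_covariates


PER_GROUP = 28  # subjects per treatment group, as stated under Subjects

# Three groups (E1, C, CQ): the text reports df = 2/80.
three_group_df = (3 - 1, ancova_error_df(3 * PER_GROUP, 3))

# Two groups (E1 and C), used once Group CQ was dropped.
two_group_df = (2 - 1, ancova_error_df(2 * PER_GROUP, 2))

print(three_group_df, two_group_df)
```

Under these assumptions the three-group comparison yields 2 and 80 degrees of freedom, in agreement with the values reported for Tests T1 and T2.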

Forward or backward effect. Groups E1
and C were compared on the 12 items in T1
from Sections A and B (the first six pages,
read before any questions were encoun-
tered) and the 12 items from Sections F and


TABLE 1
MEAN SCORES ON CRITERION TESTS BY
GROUPS FOR EXPERIMENT A
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TABLE 2 


ADJUSTED MEANS ON AB AND FG ITEMS IN
TEST T1 BY GROUPS FOR EXPERIMENT A


                   Items
Group      AB      FG      Overall
E1         2.05    4.86    3.46
C          1.80    4.31    3.05
Overall    1.93    4.59    3.25


G (read after the last questions had been
encountered). A forward effect would be ex-
pected to produce superiority of E1 on the
FG items while a backward effect should
produce superiority of E1 on the AB items.
The adjusted means are given in Table 2.

The difference between the groups was 
not significant (F = 1.72, df = 1/53, p > 
8). Although the overall difference be- 
tween scores on AB and FG was significant 
(F = 98.15, df = 1/54, p < .001), it is 
of little substantive importance since no at- 
tempt was made to control for differences in 
item difficulty. The substantively important 
Groups x Sections (AB, FG) interaction 
was not significant (F = .11, df = 1/54). 

There are theoretical grounds, however, 
for believing that forward and backward 
effects might operate in a more limited 
fashion than could be revealed in such a 
gross analysis as that shown above. A re- 
view effect should be greater for criterion 
items matched with the inserted questions 
and, in particular, for matched items test- 
ing material from pages that immediately 
preceded the inserted questions. An atten- 
tional increase following inserted questions 
would, on the other hand, exert the greatest 
effect on material from pages immediately 
following the inserted questions—an effect 
that should be revealed by both matched 
and unmatched items. 

In order to test these hypotheses, per-
formance on all criterion test (T1) ques-
tions dealing with material from pages im-
mediately before and immediately after in-
serted questions was considered. Pages 6, 9,
12, and 15 were those “before” insertions,
and pages 7, 10, 13, and 16 were those
“after” insertions. From each of these pages
there were two questions on Test T1, one
matched and one unmatched with one of
the questions on the adjacent insert. Mean
performances for the groups are shown in
Table 3.


TABLE 3
ADJUSTED MEANS ON MATCHED AND
UNMATCHED ITEMS BEFORE AND
AFTER INSERTS BY GROUPS
FOR EXPERIMENT A


Group


E1
C
Overall

Analysis of covariance (Table 4) re- 
vealed Group E1 to have been significantly 
superior to Group C in overall performance 
(F = 4.02, df = 1/53, p < .05). The signifi- 
cant effect for position does not demon- 
strate that performance on items from 
pages after questions was superior. Since 
the Position x Groups interaction effect was 
not significant, the position effect was due 
only to differences in item difficulty. The 
same is true for the significant effects for 
item type and Position x Item Type inter-
action.
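
The F ratios in this analysis can be recomputed from the sums of squares and degrees of freedom. The sketch below uses values transcribed from Table 4 and assumes the usual rule that each effect is tested against the error term listed directly beneath it; the dictionary keys are illustrative labels. Effects whose sums of squares are reported only as .00 or .01 are omitted because rounding makes their ratios unrecoverable.

```python
# (sum of squares, df, name of error term) for each effect, from Table 4
effects = {
    "G": (7.44, 1, "SG"),
    "P": (12.07, 1, "PxSG"),
    "I": (7.87, 1, "IxSG"),
    "IxPxG": (2.16, 1, "IxPxSG"),
}

# (sum of squares, df) for each error term, from Table 4
errors = {
    "SG": (98.05, 53),
    "PxSG": (44.42, 54),
    "IxSG": (30.60, 54),
    "IxPxSG": (31.17, 54),
}


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = mean square for the effect divided by mean square for error."""
    return (ss_effect / df_effect) / (ss_error / df_error)


f_values = {
    name: round(f_ratio(ss, df, *errors[err]), 2)
    for name, (ss, df, err) in effects.items()
}
print(f_values)
```

The recomputed ratios agree with the tabled F values for groups, position, item type, and the three-way interaction.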

BARRY McGAW AND ARDEN GROTELUESCHEN


The important effect in this analysis is
the Item Type x Position x Groups inter-
action effect. Although the test of this effect
fell just above the conventional level of sig-
nificance (F = 3.74, df = 1/54, p < .055),
it is examined here in some detail since a
supplementary analysis of the data in Ex-
periment B, reported later, showed the ef-
fect to have been replicated there. This re-
sult indicates that the Groups x Position of
Items interaction for matched items was
significantly different from that for un-
matched items. The effect is shown in Fig-
ure 2, from which the trend in the data can
clearly be seen. Tests of simple main effects
(Winer, 1962, p. 323) showed that E1 was
significantly superior to C on matched 
items from pages immediately before in- 
serted questions (F = 4.96, df = 1/215, p 
< .05) and on unmatched items from pages 
immediately after inserted questions (F = 
4.30, df = 1/215, p < .05). 

For the unmatched items there is little 
likelihood of a review effect operating. The 
results observed with unmatched items can 
readily be accounted for in terms of a for-
ward effect. After each set of inserted ques-
tions subjects appeared to attend more care-
fully to the text, hence the superiority of E1
over C on items from the first page after each
insert. As reading continued, attentive-
ness to the text presumably diminished, with
a consequent drop in the relative level of
performance of E1. Such a pattern can be
seen in the data when it is recognized that 
pages referred to as "before" inserts were 
three pages after the prior insert. For the 
unmatched items it seems more useful to 


TABLE 4
ANALYSIS OF COVARIANCE FOR MATCHED AND UNMATCHED ITEMS BEFORE AND AFTER POSITION
OF INSERTS FOR EXPERIMENT A


Source                           SS      df    MS      F       p
Between subjects
  Groups (G)                     7.44    1     7.44    4.02    .047
  Subjects within groups (SG)    98.05   53    1.85
Within subjects
  Position of items (P)          12.07   1     12.07   14.67   .000
  P x G                          .00     1     .00     .00
  P x SG                         44.42   54    .82
  Type of item (I)               7.87    1     7.87    13.89   .00
  I x G                          .01     1     .01     .03
  I x SG                         30.60   54    .56
  I x P                          2.16    1     2.16    3.74    .05
  I x P x G                      2.16    1     2.16    3.74    .055
  I x P x SG                     31.17   54    .57
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o a 


FIG. 2. Groups X Position of Item interaction for different types of items for Experiment A. (Panels: matched items, unmatched items; vertical axis: mean number of items correct.)


distinguish the page positions as “shortly
after” and “long after” rather than “after”
and “before.” Group E1 was superior to
Group C on questions from pages shortly
after inserts, but the effect was attenuated
with increasing textual distance from the
insert, the difference between the groups on
the “long after” pages being nonsignificant.
Such a forward effect should also operate
with the matched items but, with these
items, there is the additional possibility of a
review effect facilitating performance on
material from pages prior to the inserts.
The data for the matched items suggest
that such a review effect did, in fact, oper-
ate. Group E1 was significantly superior to
C on matched items from pages immedi-
ately prior to inserts but not on pages after
the inserts. These “after” pages, for consid-
eration of a review effect, are better re-
ferred to as “long before” inserts. Just as
the facilitative forward effect on the un-
matched items was attenuated with in-
creased textual distance after the insert, the
facilitative review effect on matched items
was attenuated with increased textual dis-
tance before the inserts.


Experiment B 

The experiment with Groups E2 and E3 
was designed to provide a further test of the 
alternative forward and backward hy- 
potheses. The design of this experiment can 
be seen in Figure 1. For Group E2, inserted 
questions occurred before Sections C, E,
and G whereas, for Group E3, they occurred
after these sections.

Forward or backward effect. The ad- 
justed mean performances of the two 
groups, on the items in Test T1 which were 
drawn from the pages in C, E, and G, are 
shown in Table 5. The data from which 
these means were obtained were analyzed in 
a repeated measures ANOVA with covari- 
ance adjustments on the between-subjects 
variable. This analysis showed that the 


TABLE 5 


ADJUSTED MEANS ON SECTIONS C, E, AND G
BY GROUPS FOR EXPERIMENT B


Group    Section C    Section E    Section G
E2       2.27         3.22         2.33
E3       1.75         3.84         2.16


TABLE 6
ADJUSTED MEANS ON MATCHED AND
UNMATCHED ITEMS BEFORE AND
AFTER INSERTS BY GROUPS
FOR EXPERIMENT B


                                       Before                 After
Page group                  Group   Matched  Unmatched    Matched  Unmatched
A (6, 7, 12, 13, 18, 19)    E2      1.40     1.31         1.38     1.63
                            C       1.09     1.83         1.43     1.18
B (9, 10, 15, 16, 21)       E3      1.28     1.20         .72      .90
                            C       .92      1.43         .44      .95


overall difference between the groups was 
not significant (F = .013, df = 1/53). The 
Groups x Sections interaction effect, how- 
ever, was significant (F = 6.81, df = 2/108, 
p < .002). 

Tests of the simple main effects (Winer, 
1962, p. 311) for groups, for each section of 
the material, revealed that the superiority 
of E2 on Section C approached significance 
(F = 3.1, df = 1/161, p < .10), that the 
superiority of E3 on Section E was signifi-
cant (F = 4.4, df = 1/161, p < .05), and
that there was no significant difference be- 
tween the groups on Section G. 

These data provide important informa- 
tion about the nature of the forward and 
backward effects. On Section C, prior to 
which only Group E2 had received inserted 
questions, the performance of E2 was supe- 
rior. The inspection behaviors of subjects in 
E2 had apparently been shaped on seeing the 
earlier questions, so that they attended to 
the specific factual information tested by 
the inserted questions and the criterion test. 
The extent to which inserted questions can 
have a facilitative shaping effect depends 
on the extent to which subjects’ habitual in- 
spection behaviors are inappropriate for the 
particular text material. Whether the effect 
would be less marked with questions of a 
different type from those used in the present 
study largely remains to be shown. Watts 
and Anderson (1971), in fact, reported data 
suggestive of a review effect rather than a 
shaping effect with “application” questions. 


Prior to Section E, on which the perform-
ance of E3 was superior, both groups had
encountered inserted questions. This superi-
ority was obtained despite the fact that
Group E2 had encountered two sets of in-
serted questions, including one immediately
prior to the section. The superiority of E3
can, therefore, be attributed to a review ef-
fect, occurring because of the inserted ques-
tions immediately after the section. Thus, it
appears that inserted questions may serve a
review function only if appropriate inspec-
tion behaviors have been used.


The nonsignificant difference on the final
section (G) cannot be accounted for with
certainty, but one possible explanation is a
recency effect obliterating the advantage to
E3 of having questions at the end of the
text immediately prior to taking the
criterion test.

Replication of results of Experiment A.
The primary analysis for Experiment B, re-
ported above, indicated that both shaping
and review effects occur. The data in Ex-
periment A suggested that the review effect
was greatest with matched items. A further
analysis was made of the data from Experi-
ment B to determine whether the effect on
matched and unmatched questions had been
replicated. In order to do this the data from
Control Group C in Experiment A were in-
cluded.

For Group E2 the pages immediately be- 
fore inserts were 6, 12, and 18, and those 
after were 7, 13, and 19. These constitute 
Page Group A in Table 6 in which the ad- 
justed mean performances of the groups are 
shown. Page Group B is the set of pages
before (9, 15, 21) and after (10, 16) for
Group E3.
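
The "before" and "after" pages for each group follow mechanically from the insert schedule. The sketch below assumes that each insert position falls after the last page of a three-page section; the text states this for Positions ab, c, d, and e (pages 6, 9, 12, and 15), while pages 18 and 21 for Positions f and g are inferred from the 21-page booklet. All names are illustrative.

```python
# Page after which each insert position falls. Positions ab, c, d, and e are
# stated in the text; f and g (pages 18 and 21) are inferred, not stated.
POSITION_PAGE = {"ab": 6, "c": 9, "d": 12, "e": 15, "f": 18, "g": 21}
LAST_PAGE = 21  # the text booklet contained 21 pages

# Insert positions used for each experimental group
SCHEDULE = {
    "E1": ["ab", "c", "d", "e"],
    "E2": ["ab", "d", "f"],
    "E3": ["c", "e", "g"],
}


def adjacent_pages(group):
    """Return (pages immediately before inserts, pages immediately after)."""
    before = [POSITION_PAGE[pos] for pos in SCHEDULE[group]]
    # An insert after the final page has no following page.
    after = [page + 1 for page in before if page + 1 <= LAST_PAGE]
    return before, after


for group in SCHEDULE:
    print(group, adjacent_pages(group))
```

For Group E3 the final insert follows page 21, so only two "after" pages exist, which is why Page Group B contains five pages rather than six.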

The analysis for Groups E2 and C, with
data from Page Group A, produced pre-
cisely the results noted in Experiment A,
namely, a significant Groups X Page Posi-
tion x Item Type interaction (F = 6.07, df
= 1/54, p < .02). The graph of this interac-
tion in Figure 3 shows the effect to be the
same as before, superior performance by the
group with inserted questions on matched
items from pages immediately prior to in-
serted questions (F = 2.1, df = 1/215, p <
.15), and on unmatched items immediately
after inserted questions (F = 4.4, df = 1/
215, p < .05). The analysis for Groups E3
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BEFORE 
FIG. 3. Groups X Position of Item interaction for different types of items for Experiment B.


and C revealed a nonsignificant Groups X 
Page Position x Item Type interaction (F
= .07, df = 1/54) but, in this case, the
Groups x Item Type interaction was sig- 
nificant (F = 3.94, df = 1/54, p < .05). 
Group E3 was significantly superior on 
matched items regardless of distance from 
the insert but there was no significant dif- 
ference on unmatched items. 


CONCLUSIONS 


Previous research on the effect of inserted
questions has shown both direct and indi-
rect facilitation of subsequent performance
on questions relating to the text. The indi-
rect facilitation has occurred on items to
which, it had been demonstrated, there was
no direct transfer model in which knowl-
edge of the answer to one question facili-
tates answering another. In this study the
matched questions were constructed in such
a way that the answer to one could not in
any way provide the answer to the other
member of the pair. Yet review of the mate-
rial required to answer an inserted question
facilitated performance on its matched
item, provided that the review occurred
within a page or so of the relevant material.


The nature of the retrieval phenomenon, 
or memory search, which gives rise to the 
facilitation is not yet clear. The important 
variable could be similarity of subject mat- 
ter though, in this study, the similarity oc- 
curred only in the subject matter of the sen- 
tence from which the responses were de- 
leted, not in the responses themselves. Al- 
ternatively, the important variable could be 
proximity of material in the text, with 
greater facilitation for contiguous material.
A further possibility is that verbatim recall
of the original sentence to complete the in- 
serted question provided also the word re- 
quired to complete the matched question, 
despite the fact it was also deleted from the 
inserted question. Not all of the questions 
required simple verbatim recall but further 
research with controlled use of verbatim 
and paraphrase items will clarify this issue. 

This study also confirmed the existence of 
a forward effect on the unmatched items. 
The results of Experiment B provide sup- 
port for a shaping hypothesis, suggesting 
that, insofar as subjects’ habitual inspection
behaviors are inappropriate, inserted ques-
tions will serve to shape appropriate behav-
iors. The use of questions testing highly


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specific factual information probably high- 
lighted this effect. The results of Experi- 
ment A suggested that, in addition to a 
shaping effect, the inserted questions serve 
also to control general attentional behav- 
iors. Subjects performed better on material 
from pages immediately after the inserts. 
Such control of attention, however, was only 
achieved with text-related questions.
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Children's imitation of a model's 
sentence structure, word content,
and use of present, imperfect, or future tense verbs was studied in
third and fourth graders. No extrinsic incentives were offered for
emulating the model, and the response parameters were shown to be
relatively independent. By comparison with a no-model control group,
significant imitative changes in all response measures resulted from
observing the model and, in some treatments, these effects were fully
maintained with new stimuli. Giving a tense prompt to some groups
primarily augmented imitation of the most rare, future tense construc-
tion.


A number of writers have recently argued 
that imitative learning cannot account for 
children’s rapid adoption of the essentials of 
a language (Ervin, 1964; Menyuk, 1964; 
Miller, 1965; Slobin, 1968). All of these au- 
thors take the basic position that language 
is governed by sets of rules that enable a 
person to produce novel utterances that are 
grammatical in the sense of conforming to 
the specified rules. The prevailing tendency 
of psycholinguists to accept language be- 
havior as imitative only if reproducing a 
model’s words, phrases, or sentences has 
placed severe, unnecessary constraint on the 
concept of imitation. 

In numerous studies, exposure to modeled 
displays has resulted in the adoption of the 
model's moral judgments (Bandura & 
McDonald, 1963), his putative emotional 
reactions (Bandura & Rosenthal, 1966), his 
preference for immediate or delayed gratifi- 
cations (Bandura & Mischel, 1965), his 
standards for self-reinforcement (Bandura, 


A portion of this study was described at the
1969 meeting of the Society for Research in Child
Development under the title “Socially-induced
imitation of grammatical structures.” We gratefully
acknowledge the splendid cooperation of Jacque
Farnum, Principal, and his teachers at Menlo
Park School, the administrative generosity of
Tucson School District 1, and we wish to thank
Albert Bandura for his advice on the manuscript.

Requests for reprints should be sent to W. R.
Carroll, Department of Psychology, University of
Arizona, Tucson, Arizona 85721.


Grusec, & Menlove, 1967), and even chil- 
dren’s willingness to offer donations to char- 
ity has been observationally modified (Ro- 
senhan & White, 1967). In these experi- 
ments, the effect of social learning variables 
was not confined to simple mimicry of the 
model's responses. 

The psycholinguists’ concept of rule-gen- 
erated behavior and a social learning con- 
ception of imitative processes clearly are 
not incompatible: No social learning tenet 
excludes rules or generic response para- 
digms from what may be transmitted obser- 
vationally. The important distinction be- 
tween competence and performance (Chom- 
sky, 1957) in language development has its 
counterpart in Bandura’s (1965) distinction 
between acquisition and performance. The 
child who observes a model learns novel re- 
sponse-selection preferences, or patterns of 
responding, that may or may not be ex- 
pressed in his overt behavior. By systemati- 
cally varying the grammatical parameters 
of modeled language, the instructions given 
the observer, and the reinforcement conse- 
quences to model and observer it should be 
possible to investigate further the conditions 
that influence the use or acquisition of rules 
in natural language. 

Two recent studies have been addressed 
specifically to modeling effects in changing 
language behavior in young children. Ban-
dura and Harris (1966) found that expo-
sure to a model who produced passive and
prepositional constructions significantly in-
creased use of these forms as compared with
a no-model control group. Odom, Liebert, 
and Hill (1968) then replicated this finding. 
The extremely low frequency of exact mimi- 
cry that was obtained further suggested 
that children were abstracting a rule for 
generating phrases, and were discriminating 
a specific phrase form (i.e., preposition- 
article-noun) as an instance of this rule. 

In the present experiments, abstraction or 
induction of the formal rule governing a 
model’s sentences was studied. Economi- 
cally disadvantaged, Mexican-American 
children served as subjects, and their adop- 
tion of the model’s sentence structure, verb 
tenses, and word content were investigated, 
as was the maintenance of imitative lan- 
guage use with new stimuli. The effect of 
brief prompts, as cues to encourage produc- 
tion of particular tenses, was also examined. 
It was hypothesized that, relative to the 
changes from base-line responding of no- 
model controls, children observing the 
model would subsequently adopt her lan- 
guage usage on all response parameters. 

METHOD

Subjects 

From a school located in an economically de-
pressed, Mexican-American community in Tucson,
80 children were randomly drawn from third- and
fourth-grade classes. Equal numbers of boys and
girls and comparable proportions from each grade
were assigned to all experimental and control
groups, whose IQ means were equated with
Thorndike-Lorge test data provided by the school.


Materials and Procedure 


Unless otherwise noted, the methodol-
ogy was the same in the three experiments. The
child was taken individually from class to an ex-
perimental trailer at the school, and was first
shown a set of 12 pictures (selected from the Golden
Stamp Book series) mounted on cards. In the guise
of a sentence game, he was asked by the adult, 
male experimenter to make up a sentence as each 
picture was presented. Following this base-line 
phase, training was conducted by having experi- 
mental subjects observe an adult, female model 
construct sentences to the same 12 pictures, with 
instructions to watch and listen carefully to her, 
and to try and figure out in what way her sentences 
were the same; children were not directed to emu- 
late her behavior. The model’s sentences in all 
variations followed a paradigm of noun subject,
auxiliary plus transitive main verb, and noun
object. In the present variation, the model used
the present progressive tense (e.g., “The boy is
pushing the boat.”); in the past variation, she used
the imperfect tense (e.g., “was pushing”); in the
future variation, she used the simple future tense
(e.g., “will push”). Except for differences in tense of
verb, all sentences modeled were identical across
variations.

After modeling (which was omitted for control
groups), the children again constructed sentences
to the original set of pictures in the imitation
phase. Then, the generalization phase was intro-
duced using a new set of 12 pictures, and simply
repeating the base-line instructions to all children.
After each child had given sentences to the new
pictures, he was thanked and returned to class. His
sentence productions were tape-recorded and later
typed for scoring.


Tense Prompt and Design 


Following the model’s performance, half the
children in each experimental tense variation were
prompted by the experimenter, before their imi-
tation phase responding, in accordance with the
tense the model had displayed. In the present
variation, the child was asked to tell “what is
happening in each picture”; in the past variation,
“what was happening”; and in the future variation,
“what will happen.” The instructions to un-
prompted children omitted the tense cue but
were otherwise the same (i.e., as in base line) for
all children. No tense prompts were given to any
child before the generalization phase pictures. To
further assess the effect of prompting, independent
of modeling, in the future variation a second no-
model control group received the future tense cue
(“Tell me what will happen in each picture”) be-
fore their imitation phase responding. With restric-
tions to assure comparability of sex, grade, and
IQ means, 10 children were randomly assigned to
each experimental and control group.

Three dependent measures were studied: tense
and sentence structure were scored for base-line,
imitation, and generalization phases, and word
content was scored for base line and imitation only
(since the pictorial content of the new generaliza-
tion stimuli precluded use of the same nouns as
before). In the present and past experiments, the
conditions described were analyzed, separately for
each response parameter, by comparing the control,
modeling no prompt, and modeling with prompt
groups across those phases scored for the measure.
In the future variation, a 2 (Modeling versus No
Modeling) X 2 (Prompt versus No Prompt) X
Repeated Phases design was used to analyze each
dependent measure. Orthogonal comparisons were
then made to clarify interaction effects in all
experiments.

Scoring of Responses

In each variation, the parameters of sentence
structure, content, and tense were scored. For each
utterance, the structure measure required a sub-
ject noun, plus any auxiliary (permitting an infini-
tive complement if present), plus a main verb,
plus an object noun for credit. The content and
verb tense were not considered so long as the gram-
matical pattern was met, and no partial credit was
given; thus, in any phase, structure scores could
range from 0 to 12 (a correct utterance for each
picture). The word-content measure (scored only
for base-line and imitation phases) assigned one
point to each of the model’s nouns, and one point
for her verb (whatever its tense). It was thus
possible to earn 3 points per picture, with a maxi-
mum of 36 per phase, for content. The tense
measure required, for each utterance, both the
modeled auxiliary (is, was, or will) and the
modeled main verb form (e.g., pushing or push),
depending upon variation. So long as the tense
construction matched that modeled, the child could
use any verb, regardless of content. A correct re-
sponse earned 1 point per utterance, with a maxi-
mum of 12 points possible in any phase. The tense
criteria thus differed among the separate experi-
mental variations, depending on which tense para-
digm had been observed, but the logic of scoring
was the same. The general control group's re-
sponses were scored according to all (present, past,
and future) tense criteria for separate comparison
with the respective modeling variations. In the rare
case of multiple-, complex-, or compound-sentence
utterances, the first sentence or independent clause
was scored. The scoring system proved reliable,
with agreement between the two independent
scorers, who each scored all responses, exceeding
97% of discrete instances (i.e., on each of the 12
responses per parameter, before these were added
for the child's total scores per phase); the few in-
stances of scoring difficulty were resolved by dis-
cussion among the writers.

It will be seen that scoring rules were intended
to distinguish the tense, structural, and word-imi-
tation components of response. Thus, if the model's
sentence were “The hunter will aim the bow,”
a child could earn a full credit tense score for any
simple future (auxiliary plus main) verb, for ex-
ample, “will capture.” Full sentence-structure
credit, in turn, was given any sentence that
matched the model’s paradigm, for example, “The
man is throwing the arrow.” Similarly, full credit
content scores were assigned to any utterance that
included the model’s subject, verb, and object, for
example, “The hunter and the warrior aiming and
pointing the arrow and the bow.” *
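
The three scoring rules described above can be sketched in code. This is an illustrative reconstruction only: it assumes each utterance has already been hand-annotated (the original scoring was done by human scorers from typed transcripts), and the class names, field names, and the "progressive"/"base" labels are hypothetical conveniences, not part of the original procedure.

```python
from dataclasses import dataclass

# Auxiliary and main-verb form required for tense credit in each variation.
TENSE_RULES = {
    "present": ("is", "progressive"),
    "past": ("was", "progressive"),
    "future": ("will", "base"),
}


@dataclass(frozen=True)
class ModelSentence:
    subject: str  # model's subject noun, e.g. "hunter"
    verb: str     # model's verb as a lemma, e.g. "aim"
    obj: str      # model's object noun, e.g. "bow"


@dataclass(frozen=True)
class Utterance:
    # Hypothetical hand-coded annotation of the first sentence or clause.
    nouns: frozenset
    verb_lemmas: frozenset
    has_subject_noun: bool
    auxiliary: str   # "" when no auxiliary was produced
    verb_form: str   # "progressive", "base", or "" when no main verb
    has_object_noun: bool


def score(utt, model, variation):
    """Return structure (0-1), content (0-3), and tense (0-1) for one picture."""
    # Structure: subject noun + any auxiliary + main verb + object noun.
    structure = int(
        utt.has_subject_noun
        and bool(utt.auxiliary)
        and bool(utt.verb_form)
        and utt.has_object_noun
    )
    # Content: one point per model noun and one for her verb, any tense.
    content = (
        int(model.subject in utt.nouns)
        + int(model.obj in utt.nouns)
        + int(model.verb in utt.verb_lemmas)
    )
    # Tense: modeled auxiliary and modeled verb form, any verb.
    tense = int((utt.auxiliary, utt.verb_form) == TENSE_RULES[variation])
    return {"structure": structure, "content": content, "tense": tense}


model = ModelSentence("hunter", "aim", "bow")

# "The man is throwing the arrow."
structure_example = Utterance(
    frozenset({"man", "arrow"}), frozenset({"throw"}), True, "is", "progressive", True
)
# "The hunter and the warrior aiming and pointing the arrow and the bow."
content_example = Utterance(
    frozenset({"hunter", "warrior", "arrow", "bow"}),
    frozenset({"aim", "point"}),
    True, "", "progressive", True,
)

print(score(structure_example, model, "future"))
print(score(content_example, model, "future"))
```

Summing the per-picture scores over the 12 pictures of a phase gives the ranges stated in the text: 0 to 12 for structure and tense, and 0 to 36 for content.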


RESULTS 


One-way analyses of variance were ap- 
plied to the base-line scores for the several 


* By an earlier, more complex scoring system (in
which partial credit was given for progressive ap-
proximations to the model's rubrics), results and
conclusions very similar to those reported here
were obtained.
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groups and response parameters within each 
experiment. All these base-line comparisons 
proved nonsignificant* (largest F = 1.96, df 
= 3/36, p > .13), indicating initial compar- 
ability among treatment conditions through- 
out, as can be seen in Table 1 which presents 
the group means of each response parameter 
by phase, for all the experimental variations. 


Experiment I: Present Variation 


For each response measure, the overall 
analyses of variance are summarized in 
Table 2, from which it can be seen that 
differences among groups, across phases, 
and Groups X Phases interaction were 
found throughout. 

Orthogonal comparisons were performed 
to clarify the meaning of the Groups x 
Phases interactions for each response meas- 
ure (in all experiments). Relative to the 
control group, the pooled modeling subjects 
increased significantly from base line to the 
later phases on all response parameters 
(smallest F = 9.26, df = 1/54, p < .01), 
with neither significant declines from imita- 
tion to generalization phases, nor any dif- 
ferences between the prompted and un- 
prompted modeling groups. 
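
The direction of the comparison between pooled modeling subjects and controls can be illustrated descriptively. The sketch below uses the present-variation sentence-structure means from Table 1 and computes the change from base line to the average of the two later phases; it is only an arithmetic illustration of the contrast, not the orthogonal F tests the authors performed, and the variable names are hypothetical.

```python
# Present variation, sentence structure, from Table 1:
# (base line, imitation, generalization)
means = {
    "model_no_prompt": (3.00, 7.20, 6.30),
    "model_with_prompt": (2.40, 6.80, 4.30),
    "control": (1.50, 1.70, 1.90),
}


def gain(rows):
    """Mean of the later phases minus mean base line, pooled over groups."""
    base = sum(r[0] for r in rows) / len(rows)
    later = sum(r[1] + r[2] for r in rows) / (2 * len(rows))
    return later - base


modeling_gain = gain([means["model_no_prompt"], means["model_with_prompt"]])
control_gain = gain([means["control"]])
contrast = modeling_gain - control_gain

print(round(modeling_gain, 2), round(control_gain, 2), round(contrast, 2))
```

Because the groups are of equal size (10 children each), pooling the two modeling groups by averaging their means is equivalent to averaging over subjects.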


Experiment II: Past Variation 

For each response measure, the overall 
analyses of variance are summarized in 
Table 3, from which can be seen a pattern 
of results very similar to that of Experi- 
ment I, but somewhat stronger in degree. 

By orthogonal comparisons, the pooled 
modeling subjects again increased signifi- 
cantly more than did controls from base 
line to the later phases on all response 
measures (smallest F = 10.86, df = 1/54, p 
< .01), but, on the tense measure only, 
there was a decline from imitation to gener- 
alization phases (F = 4.96, df = 1/54, p < 
.05). Prompted and nonprompted modeling 
groups again failed to differ on any measure 
and, in general, the interaction effects in 
both Experiments I and II were the result 
of the changes, from comparable base-line 
group means, as the modeling subjects in- 


* All significance levels reported in this paper 
were based on two-tailed probability estimates. 
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TABLE 1 


PHASE MEANS BY GROUP AND RESPONSE PARAMETER
FOR ALL EXPERIMENTAL VARIATIONS

                          Sentence structure            Content                   Tense
                       Base             General-     Base                Base             General-
Group                  line  Imitation   ization     line  Imitation     line  Imitation   ization

Present variation
Model no prompt        3.00       7.20      6.30     8.50      14.60     4.90       8.50      9.10
Model with prompt      2.40       6.80      4.30     8.30      13.50     4.20      10.20      7.30
Control no prompt      1.50       1.70      1.90     8.60       8.30     4.20       3.80      4.20

Past variation
Model no prompt        1.80       7.70      5.20     8.80      16.40      .80       4.40      3.00
Model with prompt      1.30       7.50      7.10     8.30      14.90     1.00       7.60      4.30
Control no prompt      1.50       1.70      1.90     8.60       8.30      .00        .20       .60

Future variation
Model no prompt        3.10       7.90      5.60     9.40      16.40      .00       1.40       .20
Model with prompt      2.60       5.90      5.00     8.00      12.50      .00       6.10       .50
Control no prompt      1.50       1.70      1.90     8.60       8.30      .10        .10       .00
Control with prompt    3.00       2.70      4.10     9.40       8.40      .00        .70       .00


creased their scores on all measures after
observational training.

Experiment III: Future Variation

For each treatment and response measure,
the overall analyses of variance are
summarized by Table 4, in which can be
seen significant effects on all measures for
across-phases change, observing or not
observing the model, and Modeling X Phases
interaction.

Prompting the child increased production
of future tense verbs, and interacted with
modeling so that prompted-modeling groups
appeared to derive more benefit than
prompted controls. On the tense measure,


TABLE 2 
SUMMARY OF ANALYSES OF VARIANCE FOR PRESENT VARIATION

                              Response parameter
                    Sentence structure       Content              Tense
Effect analyzed       df       F           df       F           df       F

Between groups       2/27   11.64***      2/27    5.18*        2/27    3.32*
Across phases        2/54   13.88***      1/27   29.3[?]***    2/54    9.84***
Groups X Phases      4/54    3.34*        2/27    6.66**       4/54    3.97**

*p < .05.
**p < .01.
***p < .001.
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TABLE 3 
SUMMARY OF ANALYSES OF VARIANCE FOR PAST VARIATION

                              Response parameter
                    Sentence structure       Content              Tense
Effect analyzed       df       F           df       F           df       F

Between groups       2/27    9.46***      2/27   10.87***      2/27    5.87*
Across phases        2/54   45.59***      1/27   32.88***      2/54   17.01***
Groups X Phases      4/54   11.32***      2/27    9.45***      4/54    5.09**

*p < .05.
**p < .01.
***p < .001.


the effect of prompting, and the pattern of
greater benefit for prompted-modeling
groups, proved to be mainly related to the
imitation phase; thus, significant Prompting
X Phases and Prompting X Modeling X
Phases interactions (smaller F = 5.77, df =
2/72, p < .01) were also obtained, as
reflected by the means in Table 1. On the
content measure, however, from the
significant Modeling X Prompting interaction, it
appeared that modeling groups were more
adversely affected by prompting than were
controls.

By orthogonal comparisons, it was found 
that, relative to the controls, the pooled mo- 
deling groups had increased their scores 
from base line to the later phases on each 
response parameter (smallest F = 10.93, df
= 1/72, p < .01), but there were significant
declines from imitation to generalization 
phases on both the tense and structure 
measures (smaller F = 10.53, df = 1/72, p 
< .01). On the tense measure only, score 
increases from base line were greater for 
prompted than for unprompted modeling 
subjects, but the prompted modeling group 
declined more sharply from imitation to 
generalization phases than did their un- 
prompted counterparts (smaller F = 9.46, 
df = 1/72, p < .01), consistent with the 
Prompting X Phase interactions described 
earlier. 


Relationships Among Dependent Measures 


To assess the degree of covariation 
among the response parameters, Pearson rs


TABLE 4 
SUMMARY OF ANALYSES OF VARIANCE FOR FUTURE VARIATION

                                 Response parameter
                       Sentence structure       Content              Tense
Effect analyzed          df       F           df       F           df       F

Modeling                1/36   13.72***      1/36   18.39***      1/36   13.45***
Prompting               1/36    <1           1/36    2.65         1/36    7.64**
Modeling X Prompting    1/36    3.61         1/36    5.25*        1/36    5.11*
Across phases           2/72   16.38***      1/36   26.71***      2/72   23.79***
Modeling X Phases       2/72   15.52***      1/36   42.00***      2/72   15.69***

*p < .05.
**p < .01.
***p < .001.
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TABLE 5 


CORRELATION COEFFICIENTS AMONG RESPONSE
PARAMETERS BY PHASE AND EXPERIMENT

                               Experimental variation
Parameters                  Present       Past       Future

Base-line phase
  Structure X Content         .33         -.09         .26
  Structure X Tense       [illegible]      .49*        .00
  Content X Tense             .28          .10         .00
Imitation phase
  Structure X Content         .66**        .62**       .60**
  Structure X Tense       [illegible]      .52*       -.12
  Content X Tense         [illegible]      .07        -.16
Generalization phase
  Structure X Tense           .78**        .36     [illegible]

*p < .05.
**p < .01.


were computed for the modeling groups at 
each phase, separately within experimental 
variations. These coefficients are presented 
in Table 5. 

It can be seen that a broad range of values
was obtained, with a median r of .33,
suggesting that the several response measures
did not covary greatly or consistently.
It is of interest that the widest range of
relationships involved the tense parameter,
with the highest values found when the
model used the familiar, present, verb form,
and the largest negative rs found with the
much rarer future construction. Further, the
only significant differences across experimental
variations among correlations between like
measures also involved tense, along with
structure. Thus, in base line, the rs between
structure and the present and future tenses
differed (p < .01); in imitation, the rs between
structure and the imperfect and future tenses
differed (p < .05); and at generalization, the
Structure X Present Tense coefficient was
greater than those of the past and future tense
variations (both ps < .05).
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Discussion 


Observing the model substantially increased
the frequency of all response classes
studied and, in several instances, there was
clear evidence of generalization of grammatical
rubrics to new stimuli without any
further training. It appeared that stronger
structure generalization occurred when relatively
familiar (present and imperfect)
verb forms were modeled than with the future
tense, which was rarely produced in
base line. Similarly, providing prompts assisted
imitative tense usage in the future
variation, but did not affect the more familiar
verb forms. It was also shown that the
response measures were relatively independent
of each other, and did not merely
reflect a global copying response. From a
practical standpoint, observational training
appeared to be a promising vehicle for rapidly
increasing and instating selected features
of language, which can then be further
stabilized by such techniques as practice,
discrimination training, or reinforcement. It
further seemed encouraging that the results
were obtained with children for whom English
was not the original spoken language.
This conclusion has subsequently been supported
by related research with a considerably
older Mexican-American sample
(Rosenthal & Carroll, 1972).

The present procedures differed from 
those of Bandura and Harris (1966), and 
Odom, Liebert, and Hill (1968) in a num- 
ber of respects: Most notably, in the prior 
studies, (a) the children were second grad- 
ers; (b) the child and model alternated re- 
sponse in five-trial blocks; (c) the model 
produced the relevant constructions on 75% 
of trials, and displayed prepositions and 
passives in separate stages; and (d) unlike 
the pictures presently used, no nonverbal 
stimulus props accompanied the model's ut- 
terances. It thus appears of interest that the 
previous research only obtained significant 
increases in the desired constructions when 
reinforcement was dispensed. No extrinsic 
reinforcement was provided in the present 
design, making the procedures comparable 
to a condition that, before, had failed to 
elicit significant imitative change. Taken 


together, the three studies under discussion
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have shown that children increased their 
use of natural language to accord with the 
rules governing a model’s productions. None 
of these studies bears on the de novo
acquisition of natural grammatical rules,
but it is a step forward to have shown that,
once elicited by observation, rule-governed 
response could be generalized to new stim- 
uli, and that modeling could simultaneously 
convey several attributes of language. 

Many psycholinguists contend that imi- 
tative learning plays only a small role in 
establishing language behavior. In extreme 
form, and drawing on conclusions from 
quasi-naturalistic data, this position has
been characterized by Slobin (1968) as 
casting "strong doubt on the supposition 
that imitation can be considered a useful 
explanatory device in accounting for the 
child's syntactic development [p. 438].” 
Perhaps more typical than the writers de- 
scribed by Slobin would be Chomsky’s 
(1964) viewpoint that: 


In general, it is a mistake to assume that—past 
the earliest stages—much of what the child acquires 
is acquired by imitation. This could not be true 
on the level of sentence formation, since most of 
what the child hears is new and most of what 
he produces, past the very earliest stages, is new.


These writers appear to define "imitation" 
as simple mimicry of the model's utterances 
and, by such a narrow definition, one might 
agree that imitation plays a small role in 
language learning. However, the view of 
imitation currently held in social learning 
theory (see Bandura, 1969, 1971) is consid- 
erably broader and provides for the obser- 
vational transmission of abstract paradigms 
and other rule-governed categorical behavior.

Once the rule-governed nature of language
is acknowledged, little real dispute
appears to remain between psycholinguists
interested in language acquisition (excluding
those who hold an extreme, innately
programmed view of language organization),
and psychologists interested in observational
learning. Any adequate explanation
of language will need theoretical designation
of what is learned, and specification
of the information-processing functions that
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organize language inputs. A social learning 
framework can clarify the extralinguistic 
factors determining language performance 
such as reinforcement consequences, the 
affective relationship between teachers and 
learners, the discriminability of critical lin- 
guistic features, the refinement or selective 
enhancement of subclasses of rules the child 
may possess, and procedural variables con- 
cerning the format of language displays. 


REFERENCES 


Bandura, A. Influence of models' reinforcement
contingencies on the acquisition of imitative
responses. Journal of Personality and Social
Psychology, 1965, 1, 589-595.

Bandura, A. Principles of behavior modification.
New York: Holt, Rinehart & Winston, 1969.

Bandura, A. Social learning theory. New York:
General Learning Press, 1971.

Bandura, A., Grusec, J. E., & Menlove, F. L.
Some social determinants of self-monitoring
reinforcement systems. Journal of Personality
and Social Psychology, 1967, 5, 449-455.

Bandura, A., & Harris, M. B. Modification of
syntactic style. Journal of Experimental Child
Psychology, 1966, 4, 341-352.

Bandura, A., & McDonald, F. J. The influence of
social reinforcement and the behavior of models
in shaping children's moral judgments. Journal
of Abnormal and Social Psychology, 1963, 67,
274-281.

Bandura, A., & Mischel, W. Modification of
self-imposed delay of reward through exposure to live
and symbolic models. Journal of Personality and
Social Psychology, 1965, 2, 698-705.

Bandura, A., & Rosenthal, T. L. Vicarious classical
conditioning as a function of arousal level.
Journal of Personality and Social Psychology,
1966, 3, 54-62.

Chomsky, N. Syntactic structures. The Hague:
Mouton, 1957.

Chomsky, N. Formal discussion. In U. Bellugi &
R. W. Brown (Eds.), The acquisition of language.
Monographs of the Society for Research in Child
Development, 1964, 29 (1, No. 92).

Ervin, S. M. Imitation and structural change in
children's language. In E. H. Lenneberg (Ed.),
New directions in the study of language.
Cambridge: Massachusetts Institute of Technology
Press, 1964.

Menyuk, P. Alteration of rules in children's
grammar. Journal of Verbal Learning and Verbal
Behavior, 1964, 3, 480-488.

Miller, G. A. Some preliminaries to
psycholinguistics. American Psychologist, 1965, 20,
15-20.

596    W. R. CARROLL, T. L. ROSENTHAL, AND C. G. BRYSH

Odom, R. D., Liebert, R. M., & Hill, J. H. The
effects of modeling cues, reward, and attentional
set on the properties of grammatical and
ungrammatical syntactic construction. Journal of
Experimental Child Psychology, 1968, 6, 131-140.

Rosenhan, D., & White, G. M. Observation and
rehearsal as determinants of prosocial behavior.
Journal of Personality and Social Psychology,
1967, 5, 424-431.

Rosenthal, T. L., & Carroll, W. R. Factors in
vicarious modification of complex grammatical
parameters. Journal of Educational Psychology,
1972, 63, 174-178.

Slobin, D. I. Imitation and grammatical development
in children. In N. S. Endler, L. R. Boulter,
& H. Osser (Eds.), Contemporary issues in
developmental psychology. New York: Holt,
Rinehart, & Winston, 1968.

(Received July 9, 1971)


Educational P. 
63, No. 6, 597-602
EVALUATION OF THE STANFORD CAI PROGRAM 
IN INITIAL READING 
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Stanford University 


Reading achievement of pupils who received computer-assisted
instruction in initial reading was compared with reading achievement of
pupils who did not. Twenty-two pairs of first-grade boys and 22 pairs
of first-grade girls were matched on the basis of Metropolitan
Readiness Test scores. Three posttests were administered: the Stanford
Achievement Test, the California Cooperative Primary Test, and an
individually administered test designed to measure directly the
principal goals of the computer curriculum. Separation of girl and boy
matched pairs was maintained to allow cross-sex comparisons. The
computer instruction resulted in significant posttest gains; further,
the improvements were not limited to the phonics-oriented goals of
the computer curriculum. The data suggested that computer instruction
benefits both girls and boys, but that it is relatively more effective
for boys.


[...] computer-assisted instruction (CAI) in
[...] reading (kindergarten through the
third grade) has been under development at
Stanford University since 1964. Our original
[...] was to implement a complete CAI
curriculum that would present most, if not all,
[...] initial reading instruction under computer
control and would depend only minimally
on classroom teaching. These early
efforts were successful (Atkinson, 1968),
but the high cost of the program made it
unlikely that it could be implemented on a
wide-scale basis in the foreseeable future. In
addition, our early research indicated that,
while some aspects of reading instruction
[...] be handled extremely well and inex[...]

[...] at the student terminal, and the character
set of the teletypewriter is limited to
uppercase letters. These are strong constraints
from a curriculum viewpoint, but
they are partially offset by an extremely
flexible audio system. Audio is stored in 
digitized form on magnetic disks, and the 
system provides, on a time-shared basis, 
rapid (30 milliseconds) random access to 
any one of 6,000 recorded words. 

Reading instruction may be divided into 
two basic tasks variously referred to as de- 
coding and communication. For present 
purposes, decoding is defined as the rapid, if 
not automatic, association of phonemes or 
phoneme groups with their respective 
graphic representations; communication is 
defined as reading for meaning, aesthetic 
enjoyment, emphasis, and the like. The 
major emphasis of the Stanford CAI pro- 
gram is on decoding skills, although work 
on word and sentence comprehension is also 
included. Instruction is divided into seven 
content areas or strands. Strand I, the read- 
iness strand, provides practice with the 
manual skills required for interaction with 
the CAI program and instruction on a series 
of fairly standard “reading readiness” 
tasks. Strand II, the letter strand, provides 
practice in copying, recognition, and recall 
of the letters of the alphabet. The initial 
pass through the alphabet presents letters 
singly and in maximally contrasting groups, 
for example (RTO); later passes through 
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the alphabet present letters in minimally 
contrasting groups, for example (MNW). 
Strand III, the word strand, provides for
the development of a sight word vocabu- 
lary. Seven kindergarten through third- 
grade reading vocabulary lists were ana- 
lyzed in developing this strand. Of the 
words used in Strands III through V, those 
that do not include regular grapheme-pho- 
neme correspondences are presented only in 
this strand. Strand IV, the spelling pattern 
strand, provides for recognition and recall 
of orthographically regular monosyllabic 
words arranged in groups which emphasize 
a single spelling pattern, for example (ran, 
fan, man) or (fat, fan, fad). Strand V, the 
phonics strand, provides for direct practice
in copying and recognition of the spelling 
patterns themselves as well as the “con- 
struction” of monosyllabic words from 
given consonant clusters and spelling pat- 
terns. Strand VI, the comprehension cate- 
gories strand, attempts to provide practice 
with the meaning of words by emphasizing 
the semantic categories of words. Exercises 
in this strand ask the student to select the 
word of those displayed that is an animal or 
that is a color, etc. Strand VII, the compre- 
hension sentences strand, provides practice 
in reading sentences by requiring the stu- 
dent to select a word to fill an empty “slot” 
in the sentence. On any given day, a stu- 
dent's lesson may involve exercises drawn 
from one to five different strands. A more 
complete description of the program as well 
as the rationale underlying it is presented in 
Atkinson, Fletcher, Chetin, and Stauffer 
(1971), and Atkinson and Fletcher (1972) ; 
cost considerations are discussed in Jami- 
son, Fletcher, Suppes, and Atkinson (1973). 

The CAI program is designed to permit 
each student to progress at his own rate 
through a sequence of materials that max- 
imizes his progress. Our approach has been 
to formulate mathematical models for the 
acquisition of various decoding skills and 
use these models to specify optimal procedures.
These procedures require that a rapidly
accessible response history be maintained
for each student. As the student progresses
through the curriculum, his history is


continually updated and interrogated in or-
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der to specify the curriculum items to be 
presented next. A discussion of optimization 
procedures developed for the CAI reading 
program can be found in Atkinson and 
Paulson (1972). 

The first tryout of the program occurred 
in the 1968-1969 school year with students 
in kindergarten and Grades 1, 2, and 3. As 
expected, many problems of curriculum design
and system operation were identified
and had to be corrected during this period.
By the summer of 1969, however, the sys- 
tem and curriculum had stabilized to a sat- 
isfactory level of operation, and an evalua- 
tion of the program was undertaken during 
the 1969-1970 school year. The purpose of 
this article is to summarize briefly some of 
the more important findings obtained from 
the evaluation. 


METHOD


The problems of evaluating a new curriculum 
are many, and it is difficult if not impossible to deal 
with all of them. The design adopted for this 
evaluation has its faults, but within the economic 
and administrative constraints of the situation it 
appeared to be a reasonable choice. A matched- 
pairs design was used in which compensation for 
possible differences between experimental and con- 
trol groups is achieved by matching on the basis of 
pretest scores.
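
A matched-pairs design of this kind is usually analyzed on the within-pair differences, and the t values with df = 21 reported for 22 pairs are consistent with a paired t test. The sketch below shows that computation under this assumption. The function name `paired_t` and the scores are hypothetical illustrations, not the study's data, and the published t values cannot be reproduced from group means and standard deviations alone because the pair-level scores are not reported.

```python
import math
from statistics import mean, stdev


def paired_t(experimental, control):
    """Return (t, df) for a paired t test on matched scores.

    experimental[i] and control[i] must belong to the same matched pair.
    """
    if len(experimental) != len(control):
        raise ValueError("each experimental score needs a matched control score")
    diffs = [e - c for e, c in zip(experimental, control)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical posttest scores for five matched pairs (illustration only).
cai_scores = [12, 15, 11, 14, 13]
non_cai_scores = [10, 12, 11, 11, 10]
t_value, df = paired_t(cai_scores, non_cai_scores)
print(f"t = {t_value:.2f}, df = {df}")
```

With 22 pairs, as in this evaluation, the same function would return df = 21.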

Although over 100 students were run for vary- 
ing periods of time on the CAI reading curriculum, 
the evaluation was limited to a group of 50 
matched pairs. Prior to receiving any exposure to 
CAI, 25 pairs of first-grade boys and 25 pairs of 
first-grade girls were matched on the basis of the 
Metropolitan Readiness Test (MRT). The MRT 
was administered in October to groups of 10 or fewer
pupils by trained test personnel. The Numbers and 
Draw-A-Man subtests of the MRT were not ad- 
ministered. Matching was achieved so that the 
MRT scores for a matched pair of students were 
no more than two points apart. Moreover, in 
matching the students an effort was made to insure 
that both members were drawn from comparable 
classrooms with teachers of equivalent ability. The
mean MRT score for the boys participating in the
evaluation was 56.6 and the mean for the girls was
55.1.
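
The two-point matching rule can be sketched in a few lines. The article does not say how pairs were actually formed, so the greedy adjacent-pairing strategy, the function name `match_adjacent`, and the pupil identifiers and scores below are assumptions made purely for illustration. The sketch also ignores the classroom and teacher comparability that the authors took into account, and it would be run separately for boys and for girls.

```python
def match_adjacent(scores, max_gap=2):
    """Greedily pair pupils whose readiness scores differ by at most max_gap.

    scores: dict mapping a pupil identifier to a pretest score.
    Returns a list of (pupil_a, pupil_b) tuples; unmatched pupils are dropped.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    pairs = []
    i = 0
    while i < len(ordered) - 1:
        (id_a, score_a), (id_b, score_b) = ordered[i], ordered[i + 1]
        if score_b - score_a <= max_gap:
            pairs.append((id_a, id_b))
            i += 2  # both pupils are now used
        else:
            i += 1  # pupil i has no close neighbor; try the next one
    return pairs


# Hypothetical pretest scores (illustration only).
pretest = {"a": 50, "b": 51, "c": 60, "d": 55, "e": 57}
print(match_adjacent(pretest))
```

One member of each resulting pair would then be assigned to the experimental condition and the other to the control condition.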


The experimental member of each matched pair
of students received 8 to 10 minutes of CAI
instruction per school day roughly from the first
week in January until the second week in June.
The control member of each pair received no
CAI instruction. Except for the 8- to 10-minute
CAI period there is no reason to believe that
activities during the school day were any different
for the experimental and control subjects.




Three posttests were administered to all sub- 
jects in late May and early June, 1970. Four sub- 
tests of the Stanford Achievement Test, Primary 
I, Form X, were used. These subtests were: word 
reading (S/WR), paragraph meaning (S/PM), 
vocabulary (S/VOC), and word study (S/WS). 
Second, the California Cooperative Primary Read- 
ing Test (COOP), Form 12A (Grade 1, spring) 
was administered. Only the total raw scores were 
used from this test. Both the Stanford Achievement 
Test and the COOP were administered to class- 
room groups by teachers under the supervision of 
district testing personnel. Finally, a test (DF) 
developed at Stanford and tailored to the goals of 
the CAI reading curriculum was administered
individually to all subjects. The DF items fell into
seven groups yielding the following seven subtests:
uppercase letters (D/LU), lowercase letters (D/ 
LL), uppercase words (D/WU), lowercase words 
(D/WL), spelling patterns (D/SP), monosyllabic 
words comprising these spelling patterns (D/SW), 
and nonsense monosyllables comprising these 
spelling patterns (D/SN). The words for the D/ 
WU and D/WL subtests were chosen at random 
from first-grade vocabulary lists. The spelling pat- 
terns for the D/SP, D/SW, and D/SN subtests 
had all been taught in the CAI curriculum, but 
none of the words or nonsense syllables in the 
D/SW and D/SN subtests had been taught. In 
administering the test, an item printed in primary 
type on a 3 X 5 index card is shown to a subject 
who then has 10 seconds to read the item aloud. 
The CAI subjects were expected to score higher 
on all subtests of DF than were the non-CAI sub- 
jects, since each subtest represented a specific goal 
of the curriculum. However, because the Model 
33 teletypewriters did not provide for lowercase 
printing, the CAI subjects’ gains were expected to 
be greater for the two uppercase subtests (D/LU 
and D/WU) than for the related lowercase
subtests (D/LL and D/WL). The three spelling
pattern subtests were all presented in uppercase, so
the question of upper- versus lowercase did not arise
for these presentations.

Some predictions can also be made for the 
Stanford Achievement Test subtests. The greatest 
difference between CAI and non-CAI subjects was
predicted for the S/WS since this subtest deals
directly with sounds and spelling patterns (Kelley,
Madden, Gardner, & Rudman, 1964). Very little
difference between CAI and non-CAI subjects was
expected for the S/PM subtest. Teletypewriter
presentation is not suited to large amounts of text
output in an instructional situation, and there were
no paragraph exercises in the CAI curriculum.

[...] were kept separate to see if this result would
hold for the current evaluation.
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TABLE 1 


MEANS, STANDARD DEVIATIONS, AND t VALUES FOR
THE STANFORD ACHIEVEMENT TEST, THE
CALIFORNIA COOPERATIVE PRIMARY TEST,
AND THE CAI READING PROJECT TEST

Subjects      Measure      SAT      COOP       DF

Boys
  CAI         M          109.7      33.2     64.9
              SD          24.1       8.6      7.0
  Non-CAI     M           90.2      23.4     53.0
              SD          19.5       8.9     10.4
              t           3.60*     4.70*    7.01*
Girls
  CAI         M          115.7      33.7     64.1
              SD          26.2      10.4      7.6
  Non-CAI     M           96.5      28.9     56.6
              SD          30.5      10.8     13.5
              t           2.55*     1.65     3.10*

*p < .01 (df = 21).


RESULTS AND DISCUSSION 
During the course of the school year, an
equal number of pairs was lost from the
female and male groups; complete data
were obtained for 22 pairs of boys and 22
pairs of girls. Means, standard deviations,
and t values for differences in Stanford
Achievement, COOP, and DF total scores
are presented in Table 1 for the matched
pairs of boys and the matched pairs of girls.

The results of these analyses are heartening.
Of the six posttest comparisons, only
one (COOP for matched pairs of girls)
failed to indicate a significant difference in
favor of the CAI reading subjects. These
differences are also important from the
standpoint of improvement in estimated
grade placement. Table 2 displays the mean
grade placement of the four groups on the
Stanford Achievement Test and COOP. The
differences between CAI and non-CAI
groups in estimated grade placement range
from .4 to .7 school years. Means, standard
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TABLE 2 
AVERAGE GRADE PLACEMENT ON THE STANFORD
ACHIEVEMENT TEST AND THE CALIFORNIA
COOPERATIVE PRIMARY TEST


Subjects SAT COOP 
Boys 
CAI 2.2 2.5 
non-CAI 1.8 1.8 
Girls 
CAI 2.4 2.6 
non-CAI 2.0 2.2 


deviations, and t values for the differences
on the four Stanford Achievement subtests 
are presented in Table 3 for male and
female matched pairs.

These Stanford Achievement subtests reveal
some interesting results. The S/WS
differences are fairly large as expected, but,
in the case of the boys, they fall slightly
short of statistical significance at the .01 level.
Of the four Stanford subtests, the S/WS
was expected to reflect most clearly the
goals of the CAI curriculum; yet greater
differences between CAI and non-CAI
groups were obtained for both the S/WR
and S/PM subtests. Also notable is the lack
of any real differences for the S/VOC. One
explanation for this result is that the vocabulary
subtest measures a pupil's vocabulary
independent of his reading skill (Kelley et
al., 1964); since the CAI reading curriculum
is primarily concerned with reading
skill and only incidentally with vocabulary
growth, there may have been no reason to
expect a discernible effect of the CAI curriculum
on the S/VOC. Most notable, however,
are the S/PM results. In both the
male and female groups, the CAI students
performed significantly better on paragraph
items than did the non-CAI students,
despite the absence of paragraph items in the
CAI program and the relative dearth of
sentence items. These results for phonics-oriented
programs are not unprecedented, as
Chall's (1967, pp. 106-107) survey shows.
Nonetheless, for a program with so little
emphasis on connected discourse, they are
surprising.

J. D. FLETCHER AND R. C. ATKINSON

Means, standard deviations, and t values
for five of the seven DF subtests are
presented in Table 4 for male and female
matched pairs. All subjects obtained perfect 
or nearly perfect scores on the DF letter 
subtests (D/LL and D/LU); the male and 
female CAI groups obtained slightly higher 
scores on the subtests than did their non- 
CAI counterparts, but the performance was 
so high and the variability so low that com- 
parisons are not justified. These data sug- 
gest that a lesser emphasis on letter teach- 
ing in the CAI program is in order, and that 
the time devoted to letter teaching should 
be shifted to other strands of the curricu- 
lum. 

The DF word subtests presented in Table 
4 show the expected superior performance 
of the CAI groups on the uppercase presen- 
tations (D/WU). However, the differences 
favoring the CAI groups for lowercase 
presentations are also statistically significant,
though of a lesser magnitude than
those for uppercase presentations. Evi- 


TABLE 3 
MEANS, STANDARD DEVIATIONS, AND t VALUES FOR
THE WORD READING (S/WR), PARAGRAPH
MEANING (S/PM), VOCABULARY (S/VOC),
AND WORD STUDY (S/WS) SUBTESTS OF
THE STANFORD ACHIEVEMENT TEST

Subjects      Measure     S/WR     S/PM     S/VOC     S/WS

[Table entries not recoverable from the source.]

*p < .01 (df = 21).
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TABLE 4 


MEANS, STANDARD DEVIATIONS, AND t VALUES FOR THE UPPERCASE WORDS (D/WU), LOWERCASE WORDS
(D/WL), SPELLING PATTERNS (D/SP), SPELLING PATTERN WORDS (D/SW), AND SPELLING
PATTERN NONSENSE SYLLABLES (D/SN) SUBTESTS OF THE CAI READING PROJECT TEST
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Subjects Measure D/WU D/WL | D/SP | D/SW D/SN 
CAI 
Boys M 10.4 9.9 7.0 7.0 6.6 
SD 2.1 1.6 1.5 1.4 1.6 
t 5.20* 4.72* 3.62* 4.91* 5.80* 
Non-CAI 
M 7.4 7.3 5.5 5.1 4.4 
SD 2.7 2.6 2.1 1.9 2.0 
CAI 
Girls M 10.5 10.2 7.4 6.3 6.5 
SD 1.7 1.9 1.0 1.8 2.0 
t 3.37* 2.63* 3.35* 1.44 2.56* 
Non-CAI 
M 8.3 8.7 5.5 5.5 4.8 
SD 3.2 2.8 2.7 2.9 2.8 


*p < 01 (df = 21). 


dently, the lack of lowercase letters in the
CAI program is a handicap of only minor
importance.

The DF spelling pattern subtests reflect
goals that are at the heart of the CAI
curriculum. The greatest effect of the curriculum
was expected to be on the ability of students
to recognize spelling patterns (D/SP),
to pronounce orthographically regular
but unfamiliar words comprising the spelling
patterns (D/SW), and to pronounce
orthographically regular nonsense syllables
comprising the spelling patterns (D/SN).
Five of the six obtained differences on these
spelling pattern subtests are statistically
significant.

It was expected that some measure of
performance on the system would correlate
fairly highly with the Stanford Achievement,
COOP, and DF total scores. These
correlations are presented in Table 5. The
measure of performance on the CAI curriculum
used here is the total number of curriculum
units brought to criterion by a student
divided by the total amount of time
accumulated on the system. Note that the
correlations are particularly high for the
Stanford Achievement and COOP scores.
Oddly, the smallest correlations are with
the DF scores, which were expected to reflect
more closely than the Stanford Achievement
and COOP the goals of the curriculum.
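
The performance measure and its correlation with a posttest can be sketched as follows. The article does not state the unit of time or the correlation formula, so hours, the product-moment (Pearson) formula, the function names `criterion_rate` and `pearson_r`, and all of the data are assumptions for illustration only.

```python
import math


def criterion_rate(units_to_criterion, time_on_system):
    """Curriculum units brought to criterion per unit of time on the system."""
    if time_on_system <= 0:
        raise ValueError("time_on_system must be positive")
    return units_to_criterion / time_on_system


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical records for four students (illustration only).
units = [120, 90, 150, 60]        # curriculum units brought to criterion
hours = [15.0, 15.0, 15.0, 12.0]  # time accumulated on the system
posttest = [110, 95, 125, 80]     # total posttest scores

rates = [criterion_rate(u, h) for u, h in zip(units, hours)]
print(rates, round(pearson_r(rates, posttest), 2))
```

Dividing by time on the system keeps students who simply logged more sessions from appearing more proficient than students who progressed faster per session.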

It is interesting to examine the effect of 
CAI on the progress of the boys compared 
to the progress of the girls. The results pre- 
sented in Table 1 seem to corroborate the 
Atkinson (1968) finding that boys benefit 
more from CAI instruction than do girls. 
For both the Stanford Achievement and 
COOP tests, the girls are superior to the 


TABLE 5 
CORRELATIONS OF THE NUMBER OF CAI READING
ITEMS COMPLETED PER UNIT TIME WITH THE
STANFORD ACHIEVEMENT TEST, THE
CALIFORNIA COOPERATIVE PRIMARY
TEST, AND THE CAI READING
PROJECT TEST

Subjects      SAT      COOP      DF
Boys          .74      .68       .48
Girls         .84      .47       .49




boys, but for the non-CAI group the size of 
the difference is greater than for the CAI 
group. On the Stanford Achievement Test, 
the relative improvement for boys exposed 
to CAI versus those not exposed to CAI is 
22%; the corresponding figure for girls is 
20%. On the COOP, the percentage improve- 
ment due to CAI is 42 for boys and 17 for 
girls. Finally, on the DF, the percentage im-
provement is 32 for boys and 18 for girls.
Overall, these data suggest that both boys 
and girls benefit from CAI instruction in 
reading, but that relatively CAI is more ef- 
fective for boys. Explanations of this differ- 
ence are discussed in Atkinson (1968). 
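The percentage figures in this paragraph are relative improvements. The text does not give the formula, so the sketch below assumes the usual one: the difference between the CAI and non-CAI means, expressed as a percentage of the non-CAI mean. The function name and the sample means are hypothetical and are not taken from the tables.

```python
def relative_improvement(cai_mean, non_cai_mean):
    """Percentage gain of the CAI mean over the non-CAI mean (assumed formula)."""
    if non_cai_mean == 0:
        raise ValueError("the comparison mean must be nonzero")
    return 100.0 * (cai_mean - non_cai_mean) / non_cai_mean


# Hypothetical means, chosen only to show the arithmetic.
gain = relative_improvement(12.0, 10.0)
```

Under this definition, a group with a lower non-CAI baseline shows a larger percentage gain for the same absolute difference, which bears on any boy-girl comparison of this kind.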


CONCLUSION


The daily, 8- to 10-minute CAI sessions
in initial reading yielded improvements on
posttest performances that were significant
from both a statistical and practical stand-
point. These improvements were not limited
to the specific, phonics-oriented goals of the
CAI curriculum, but include improvements
in more general reading skills related to
sentence and paragraph comprehension.

In evaluating these results it should be 
kept in mind that the experimental group 
received CAI for only 5½ months. The fact
that the observed differences are so sub- 
stantial suggests that the CAI treatment 
administered over several years could well 
have dramatic results. Although we have no 
systematic data on students who have been 
on the program for several years, the size of 
the effects observed in this short-term study 
is in accord with our impressions of the
improvements achieved by students with 
more extended experience. 

Other analyses of data have been run but 
will not be reported here. Our main purpose 
in this paper is to briefly report a few of the
more important results from the evaluation, 
but not to offer any firm conclusions or in- 
terpretations. We recognize that some read- 
ers will not be happy with our multiple use 
of paired t tests. They are presented not as
definitive measures of statistical signifi- 
cance but rather as rough indexes of the 
influence of CAI on the various dependent 
measures. Also, we expect that there will be 
readers interested in analyzing aspects of 
these data not reported here. With this in 
mind, we have recorded our data in a for- 
mat that can be readily sent to those who 
request it. 
server and a nonconserver. These subjects
their thinking on two of the pretest 


answer between them for each of the 
lucted 1 month later. Ten conservers 


no peer interaction experience but 


tasks used in the pretest was cond 


Group problem solving is of potential
interest to educators for several reasons. One
reason is that experience in problem-solving
groups may serve to stimulate or facilitate
problem solving on an independent basis.
Indeed, some theorists (e.g., Bales, 1950;
Piaget, 1932, 1950, 1967) have taken the
position that problem solving with others is
a necessary condition for the development
of individual thought. Other reasons for
educators being interested in group problem
solving include the following: it shifts to
the students the responsibility for their own
learning activity; it provides experience in
a group task; and it provides an occasion

1 This article is based on a master's thesis
completed by the second author under the direction
of the first author. Preparation of the manuscript
was supported by a grant from the United States
Office of Education, Department of Health,
Education, and Welfare to the first author. The
opinions expressed herein, however, do not
necessarily reflect the position or policy of the
United States Office of Education, and no official
endorsement by the United States Office of
Education should be inferred. The authors wish to
thank the Toledo Public Schools for cooperating in
this study.
2 Requests for reprints should be sent to Irwin
William Silverman, Department of Psychology,
Bowling Green State University, Bowling Green,
Ohio 43403.


for increasing interpersonal sensitivity and 
communication (Gorman, 1969). 

Undoubtedly, the first of these reasons is 
the most significant in terms of its implica- 
tions for not only educational practice but 
also developmental theory. The present 
study, however, does not constitute a test of
Bales' or Piaget's hypothesis. Instead, we
have pursued a much narrower objective: to
investigate the question of whether experi-
ence in a problem-solving group (more spe-
cifically, a dyad) effects relatively perma-
nent and generalizable changes in the cogni-
tive functioning of the participants.

Though voluminous, the literature on
group problem solving (see the reviews by
Hartup, 1970; Kelley & Thibaut, 1969)
offers practically no evidence on the question
posed in this study. Previous researchers
have been more intrigued by the group as a
social system, and consequently they have
concerned themselves with questions such
as how people behave in groups and how
group products compare with individual
products. Parenthetically, we might mention
that the problems usually selected for
investigation are so artificial as to be quite
outside the domain of classroom learning.
Moreover, in many cases there is little
reason to believe that the problems studied
(such as solving arithmetic problems, word
puzzles, jigsaw puzzles, completing
limericks, finding antonyms, and the like)
require any new cognitive structuring on the
part of the subjects.

In the present study children who con-
served area and those who did not were
given the task of reconciling their differing 
views. It was predicted that conservers 
would prevail over nonconservers during the 
interaction. According to Piaget’s (1967) 
theory of conservation, the conserver should 
be more resistant to change because conser- 
vation implies the constitution of mental 
operations into an equilibrated structure, 
such that different transformations of the 
stimuli can be anticipated and compensated 
in thought. It was also predicted that the 
conserver would tend to be the major initia- 
tor of the interactions. This prediction fol- 
lows from Piaget’s conception of a cognitive 
equilibrium as engendering a feeling of ne- 
cessity and coherence. 

Another feature of this study was the in- 
clusion of a posttest, given about 1 month 
after the interaction experience. The pur- 
pose of the posttest was to determine 
whether the group experience would pro- 
duce lasting and generalizable changes in 
the nonconservers' understanding of area
conservation. No prediction was made re- 
garding the posttest results. The question of 
whether the group experience would pro- 
duce any long-term effects was viewed as 
essentially an empirical matter. 


METHOD


General Design 

[...] children were pretested on four
conservation-of-area problems. The children were
classified into three groups: conservers,
nonconservers, and transitional conservers. Fourteen
pairs of subjects were composed, each containing a
conserver and a nonconserver. They were told of the
difference in their [...] subjects w[...]
later posttested on [...] administered during
the pretest. A control group, composed of 10
conservers and 13 nonconservers, was given no
interaction experience. The control subjects were
posttested in the same manner as the experimental
subjects. The pretest and interaction phases were
administered by one experimenter and the posttest
by another experimenter.


Subjects 


The subjects were third-grade children attend- 
ing two public elementary schools. The schools 
were located in suburban, white, middle-socioeco- 
nomic-class areas. The experimental subjects were 
all drawn from one school and the control subjects 
were all drawn from the other school. The decision 
to use separate schools for the experimental and 
control groups was based on the fact that fewer 
subjects could be selected for the interaction se- 
quence if both schools contributed to the experi- 
mental and control groups. (Since the study was 
carried out at the end of the school year, there was 
no time available to work at additional schools.) 


Pretest 


The pretest problems were derived from Piaget, 
Inhelder, and Szeminska (1960), and had as a com- 
mon feature that two figures could be made equal 
in area by rearranging their parts. Two problems 
involved the rearrangement of squares into differ- 
ent configurations; the other two problems in- 
volved the rearrangement of a triangle and a trap- 
ezoid into different configurations. The general 
testing procedure was as follows. The subject was 
confronted with two identical figures. He was asked 
to ascertain that the two figures covered the same 
amount of space and was encouraged to superim- 
pose, manipulate, or measure the figures to insure 
his agreement as to their equality in area. The sub- 
ject then watched as the experimenter rearranged 
the parts of one figure. The subject was asked to 
judge whether the amount of space covered by 
the altered figure was the same as that covered by 
the unaltered figure. After giving his judgment, 
the subject was asked to explain his answer. The 
problems were presented in a random order, sepa- 
rately determined for each subject. 

The subjects were classified into one of three 
groups based on their pretest performance. If a 
conserving answer and an adequate explanation 
were given on all four conservation-of-area tasks, 
the subject was classified as a conserver. If no con- 
serving answers were given, the subject was classi- 
fied as a nonconserver. Subjects who deviated 
from either of the above classifications were classi- 
fied as transitional conservers and dropped from 
the study. The criteria for an adequate explana- 
tion corresponded to those by Gelman (1967). Two 
independent judges who categorized a random sam- 
ple of 32 explanations obtained 100% agreement. 
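The classification rule in the preceding paragraph can be written out as a short sketch. The function name and the encoding of each task as a pair of booleans (conserving answer, adequate explanation) are choices made for this illustration, not part of the original procedure.

```python
def classify(responses):
    """Classify a subject from four (conserving_answer, adequate_explanation) pairs.

    conserver:     conserving answer AND adequate explanation on all four tasks
    nonconserver:  no conserving answer on any task
    transitional:  anything else (such subjects were dropped from the study)
    """
    if len(responses) != 4:
        raise ValueError("the pretest had four conservation-of-area tasks")
    if all(answer and explanation for answer, explanation in responses):
        return "conserver"
    if not any(answer for answer, _ in responses):
        return "nonconserver"
    return "transitional"


# A subject who answers correctly on every task but explains one answer poorly.
example = classify([(True, True), (True, True), (True, False), (True, True)])
```

Note that a single inadequate explanation is enough to move a subject out of the conserver group, which is why the transitional category can absorb a sizable share of the sample.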


Interaction Sequence 


The interaction phase of the experiment was
conducted within 1 week of the pretest. The sub-
jects participating in the interaction phase were
matched for sex (excepting one pair) and their
having been together in the same class since the
beginning of the school year. The sex breakdown
of the pairs was as follows: boys, 10 pairs; girls,
3 pairs; mixed, 1 pair. (The greater number of
boy than girl pairs was purely accidental.) The in-
structions were as follows:


The other day I asked you some questions. Do 
you remember? I had two pieces of paper that 
covered the same amount of space, just like these 
two. Do these cover the same amount of space? 
Then I took one piece of paper and made it 
look like this (transform the figure). ——, you 
said that it was not the same. You said that this 
one covered more space. —, you said that this 
one covered the same amount of space as that 
one. You don't agree, do you? I want you to try 
to agree on one answer. Talk to each other and 
use the paper if you want to. Remember, you are 
both to agree on one answer. I'll just listen and 
I won't say anything while you are talking with 
each other. When you have finished talking de- 
cide which of you will give the answer. Okay? Do 
you have any questions? Remember, talk it over 
until you agree on one answer. 


The same two problems were used for all interact- 
ing pairs (squares and trapezoid). The interactions 
were all videotaped, but the poor quality of the 
audio prevented detailed analysis of these
records. 


Posttest 


The posttests were administered about a month 
after the pretest. The experimenter was “blind” 
with respect to the classification of the subjects, 
the identification of the yielders in the interaction 
sequence, and the fact that different schools sup-
plied the experimental and control subjects. A
safeguard against the experimenter becoming aware 
of the difference between schools was provided by 
having him test both conservers and nonconservers 
at the control school. 


RESULTS 


Interaction Sequence 


It was predicted that conservers would 
tend to take the lead in initiating the inter- 
action. There was a significant difference in 
the predicted direction on the first task of 
the interaction. The conservers were the ini- 
tiators 11 out of 14 times (p < .05, Fisher's 
exact test). In the second task the conserv-
ers were the initiators 8 out of 14 times,
yielding a nonsignificant difference between
their performance and that of the noncon- 
servers (p > .05, Fisher's exact test). 
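One way to reproduce the direction of these two results is sketched below. The 2 x 2 layout is an assumption, since the text does not show the table that was tested: rows are conservers and nonconservers, columns are "initiated" and "did not initiate", so 11 initiations by conservers imply 3 by nonconservers. The function name is invented for the example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalized probability of a table whose top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Assumed layout: conservers initiated 11 (then 8) of the 14 interactions.
p_first_task = fisher_exact_two_sided(11, 3, 3, 11)
p_second_task = fisher_exact_two_sided(8, 6, 6, 8)
```

Under this layout the first task gives a p value of roughly .007 and the second roughly .71, which agrees with the reported pattern of p < .05 and p > .05. Integer weights are compared rather than floating-point probabilities, so ties between equally probable tables are handled exactly.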

The direction of change in conservation 
test responses as a result of the interaction 
was of primary interest. The conserver pre-
vailed 11 times on both of the tasks. In one
remaining pair, the conserver yielded to the 
nonconserver on both tasks. In another re- 
maining pair, the subjects agreed on the 
conserving answer on the first problem, but 
in the second problem, they agreed on the 
nonconserving point of view. In the final 
pair, the subjects could not come to agree- 
ment and the decision was left unresolved. 
In order to test the prediction that conserv- 
ers would prevail over nonconservers during 
the interaction, it was decided not to in- 
clude in the analysis the pair in which the 
subjects stalemated. If then one compares 
the expected proportions of conservers pre-
vailing on two interactions (1/4) versus one
or zero interactions (3/4) with the obtained
proportions, the difference is highly signifi-
cant (χ² = 24.92, p < .001) in the direction
predicted. 

The explanations offered to the noncon- 
servers by the conservers during the inter- 
action sequence were tallied. It was noted 
that in every instance in which the noncon- 
server yielded to the conserver, the former 
explained the answer with an explanation 
provided by the latter. 


Posttest 


The subjects were classified into conserv- 
ers, nonconservers, and transitional con- 
servers using the same criteria as on the 
pretest, but this time using all four prob- 
lems as a basis for classification. Transi- 
tional conservers were classified in two 
ways: as having changed or not having 
changed from their pretest positions. The 
conclusions are the same with either scoring 
method, so the results for only the second 
procedure will be reported here. 

The results for the experimental group 
are as follows. Eleven nonconservers were 
reclassified as conservers on the posttest. 
All of these subjects had agreed to the con- 
serving answer on both problems during the 
interaction. Only one conserver was reclas- 
sified as a nonconserver on the posttest. 
This subject had agreed to the nonconserv- 
ing answer on both problems during the in- 
teraction. The shift in classification be- 
tween pre- and posttest was significant for
nonconservers (z = 3.32, p < .001), but it
was nonsignificant for conservers (z = 1.00,
p < .25).

Changes within the control group from 
pre- to posttest were meager. Only one sub- 
ject, a nonconserver, was classified as hav- 
ing switched his point of view. 

Finally, the difference between the exper- 
imental and control groups in the propor- 
tion of nonconservers who were classified as 
conservers on the posttest was evaluated. 
The resulting difference was significant (z 
= 3.18, p < .05). 

Examination of the explanations given on 
the posttests by the nonconservers who had 
yielded on both problems to the conservers 
revealed only one instance of an explana- 
tion which had not been provided by the 
conserver during the interaction sequence. 


DISCUSSION


The prediction that conservers would pre- 
vail over nonconservers during the interac- 
tion received substantial support. More im- 
portant, however, are the findings for the 
posttest. In the first place, it was found that 
all the nonconservers who had yielded to 
the conservers on both problems during the 
interaction still gave conserving responses a 
month later. Second, in these subjects there 
was a 100% transfer of the conservation re- 
sponse to two new tasks which presented 
different configurations but which were solv-
able by the same rule. Further evidence for
the generalizability of conservation re-
sponses acquired through peer interaction is
seen in the fact that posttesting was con-
ducted by a new experimenter. The findings
that the responses and explanations were en- 
during and generalizable support the view 
that some kind of cognitive reorganization 
had taken place (Piaget, 1964). Unfortu- 
nately, it is not possible to draw any conclu- 
sion regarding the relative potency of par- 
ticipation in a problem-solving group as a 
conservation induction procedure. According 
to Brainerd and Allen’s (1971) review, there 
has been no prior attempt to train conser- 
vation of area. 

We will now consider some factors that 
are possible explanations for the success of 
the interaction experience as a conservation 
training technique. One is that the interac-
tion experience provided nonconservers with
a verbal algorithm or model (Beilin, 1965) 
for solving conservation problems. This ex- 
planation is consistent with the finding that 
nonconservers justified their answers with 
the same explanations articulated by their 
conserving partners during the interaction 
sequence. Other evidence for the efficacy of 
a verbal model comes from several investi- 
gators (Beilin, 1965; Hamel & DeWitt, 
1971; Peters, 1970; Smith, 1968; Sjöberg,
Höijer, & Olsson, 1970) who reported suc- 
cessful conservation training outcomes us- 
ing verbal rule instruction as a training 
procedure. Only Mermelstein and Meyer 
(1969) failed to obtain positive results with 
verbal instruction. Sullivan (1967) found 
that just watching an adult explain con- 
servation principles was sufficient to in- 
duce conservation. Sullivan interprets these 
results as supporting the view that lan- 
guage is the foundation of all rational ac- 
tivity, with conservation being a funda- 
mental example of such activity. Beilin 
(1965), however, points out a limitation in 
the effectiveness of a verbal model: there 
is no generalization to nontrained proper- 
ties. He concludes from this fact that “some 
element beyond verbal model training is 
necessary for ‘full’ conservation, which no 
other training procedure is able to provide 
either, but which is achieved in less formal 
learning settings [p. 337].” Specifically, Bei- 
lin found that a large proportion of sub- 
jects who could conserve number and length 
without training could also conserve area. 
In contrast, in subjects who received verbal 
instruction (or two other training proce- 
dures) there was little or no generalization 
to the area concept.

A second factor in the interaction experi- 
ence is the mere exposure to a person who 
gives a conservation answer. Sullivan, in 
the aforementioned study, included a condi- 
tion in which the adult model gave conser- 
vation answers but did not give explana- 
tions in terms of conservation principles. 
While most subjects in this condition ac-
quired and even generalized the conserva-
tion response, only a small proportion of
them could justify their answers with a
conservation principle. It would therefore
appear that contradiction by another per-
son cannot by itself account for the present
findings in that the interaction experience 
served to induce conservation explanations 
as well as conservation answers. 

A third factor in the interaction experi- 
ence is the source of the conservation view- 
point. Piaget (1926, 1932) has asserted that 
children are less egocentric with other chil- 
dren than with adults. By egocentric Pia- 
get means the centering of one’s viewpoint 
to the exclusion of other viewpoints. Hence, 
the child would be more likely to revise his 
position when the discrepant information 
comes from a child rather than an adult. 
However, as we have seen, the provision of 
a verbal model is a potent procedure for 
inducing conservation whether the com- 
municator is an adult or a child. Nonethe- 
less, peers may on the whole be more effec- 
tive in the role of conveyers of a given con- 
servation concept. Moreover, if a conserva- 
tion principle is explained by a child, there 
may be greater generalization of conserva- 
tion or greater resistance to counter-sugges- 
tion. These are intriguing questions for fu- 
ture research. 

It was found that only those nonconservers
who capitulated to conservers during the
interaction sequence consistently gave
conservation responses in the posttest. If the
nonconserver is to profit from the interaction
experience, it is important, therefore,
that he yield to the conserver during the
interaction sequence. Although Piaget’s
theorizing concerning conservation suggests
that factors internal to the person are
primarily responsible for the dominance of
conservers over nonconservers, interpersonal
factors may in fact weigh more in
determining who is ascendent within an
interaction situation. The most obvious
possibility is that the participant who prevails
has the higher perceived intelligence or
academic ability. A post hoc examination of
this possibility was made with the aid of
the school records. The California Mental
Maturity Tests had been administered to
the subjects approximately 1 year before
the present study was conducted. The mean
IQs derived from this test for the conservers
and nonconservers were 111.83 and 109.08,
respectively. This result contradicts the
perceived intelligence explanation of
yielding. However, a better test of this
explanation would consist of using sociometric
devices to assess the intellectual reputations
of the subjects.

A question that may be raised in connec- 
tion with the factors responsible for the 
dominance of the conservers over the non- 
conservers is whether the conservers were so 
intimidating as to coerce nonconservers into 
changing their position. This did not appear 
to be the case, as there were few instances 
of browbeating on the part of the conserv- 
ers. On the other hand, conservers were 
more likely to initiate the interactions, 
which is consistent with the prediction that 
conservers would be more assertive than 
nonconservers. 

In summary, then, the present study 
showed that the conservation responses ac- 
quired through participation in a problem- 
solving group are enduring and generaliza- 
ble. Additional work should be directed to- 
ward confirming and extending the present 
findings. For example, it is of interest to 
determine whether the effects of participat- 
ing in a problem-solving group would gen- 
eralize from one conservation concept to 
another, such as from area to weight or vol- 
ume. In addition, it is of interest to deter- 
mine whether younger children’s conserva- 
tion concepts can be modified through ex- 
perience in a problem-solving group. 


ORGANIZATION AND STUDY TIME IN LEARNING
FROM READING¹


MORTON P. FRIEDMAN² AND FRANK L. GREITZER
University of California, Los Angeles


College subjects read versions of a reading passage dealing with
several fictitious types of fish, organized either by name or attribute.
In the name-organized passage, all of the attributes of a given fish were
described before introducing the next fish. In the attribute-organized
passage, all fish associated with given attribute values were named
before introducing the next attribute and its values. Half of the subjects
read the same passage twice; the other half read the passage with the
alternate organization on their second reading. Other variables
investigated were study time and the syntactic structure of the sentences
used in the text. Free-recall, cued-recall, and recognition tests were
used to assess memory. The main results were that (a) the subjects who
read the attribute-organized passage in both readings achieved the
highest recall scores, and (b) the subjects’ free-recall organization
generally followed the organization of the last passage they read. It was
argued that the attribute-organized passage yielded superior recall
because it allowed for a hierarchically organized information retrieval
scheme.


1 This research was sponsored by the National


This article considers the effects of
organization of reading material on the level
and structure of substantive memory for
that material. There is much work in
human learning today that points up the
importance of organization in learning and
recall of verbal materials (e.g., for word list
learning, Mandler, 1967, and Bower, 1970;
for prose learning, Frase, 1969; for
programmed instruction, Payne, Krathwohl, &
Gordon, 1967). This is not a new topic in
human learning research. There are important
contributions to organization and
memory in the work of Bartlett (1932),
Katona (1940), and others. However, this
classical literature has recently been
reinterpreted in terms of information-processing
views of learning and memory based on
ideas from research in automata, computers,
and information retrieval systems
(Neisser, 1967; Norman, 1970). The present
work is discussed from this contemporary
point of view. In the information-processing
framework, the problem of learning from 
reading might be described in terms of (a) 
the reader "encoding" the information in 
reading material into some sort of memory 
structure, and (b) the development of a 
program or strategy to retrieve the infor- 
mation from memory. 

To illustrate the application of informa- 
tion-processing notions to problems of orga- 
nization and memory for text, consider the 
materials used in the present study which 
are outlined in Table 1. The text described 
the characteristics of several large schools 
of fish that were recently discovered in 
Lake Sydney. The fish varied in three di- 
mensions—color, depth at which they were 
found, and diet. There are two reasonable 
ways to organize the information in text 
form: First, the material can be organized 
by dimension or attribute, presenting the 
information column by column. Alterna- 
tively, the information can be organized by 
name, presenting the information row by 
row. The learner’s task is to remember all 
of the fish-attribute associations in the 
table. Which of the two organizations 
should produce superior learning? 
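The two ways of ordering the same information can be made concrete with a small sketch. The fish data follow Table 1; the dictionary layout and the function names are choices made for this illustration only.

```python
# Name-attribute associations from Table 1.
FISH = {
    "Hatfish":   {"color": "orange", "depth": "200 fathoms", "diet": "plankton"},
    "Loopfish":  {"color": "orange", "depth": "200 fathoms", "diet": "spawn"},
    "Arcfish":   {"color": "orange", "depth": "600 fathoms", "diet": "algae"},
    "Bonefish":  {"color": "blue",   "depth": "400 fathoms", "diet": "plankton"},
    "Pinfish":   {"color": "blue",   "depth": "400 fathoms", "diet": "algae"},
    "Scalefish": {"color": "blue",   "depth": "600 fathoms", "diet": "spawn"},
}
DIMENSIONS = ("color", "depth", "diet")


def name_organized(table):
    """Row by row: each fish followed by all of its attribute values."""
    return [(name, tuple(attrs[dim] for dim in DIMENSIONS))
            for name, attrs in table.items()]


def attribute_organized(table):
    """Column by column: each value of a dimension with the fish sharing it."""
    units = []
    for dim in DIMENSIONS:
        groups = {}
        for name, attrs in table.items():
            groups.setdefault(attrs[dim], []).append(name)
        units.extend((dim, value, tuple(names)) for value, names in groups.items())
    return units


by_name = name_organized(FISH)            # 6 units, one per fish
by_attribute = attribute_organized(FISH)  # 8 units, one per attribute value
```

Both orderings carry the same 18 fish-attribute associations; they differ only in which element, the fish name or the attribute value, heads each unit of text.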

It can be argued (after Frase, 1969) that
TABLE 1
DIMENSIONS OF THE READING PASSAGE

                          Attribute
Name of fish   Color     Depth          Diet
Hatfish        orange    200 fathoms    plankton
Loopfish       orange    200 fathoms    spawn
Arcfish        orange    600 fathoms    algae
Bonefish       blue      400 fathoms    plankton
Pinfish        blue      400 fathoms    algae
Scalefish      blue      600 fathoms    spawn


the name-organized passage should produce
better recall because it presents fewer
demands on input processing and memory
mechanisms during reading, and allows the
information to be coded into memory with
greater ease than the attribute-organized
passage. For example, a typical sentence in
the name-organized passage reads, “Hatfish,
which are orange, live at a depth of
two hundred fathoms and feed primarily on
plankton.” Thus, in reading and remembering
the name-organized passage, subjects
could store one name, memorize a list of
attributes associated with that name, then
store another name, etc. The subjects reading
an attribute-organized passage would
have to store all of the names and possibly
several values of one attribute before going
on to the next attribute. The name-organized
passage “would represent the least
amount of change from sentence to sentence
and permit relatively direct classification of
the information by concept name [Frase,
1969, pp. 400-401].”

Alternatively, it could be argued that the
attribute-organized passage should produce
better recall because it presents a better
retrieval scheme for the learner than the
name-organized passage. The attribute-organized
passage leads to a hierarchically
organized retrieval scheme which has been
shown to be very effective in recalling lists
of words (Bower, Clark, Lesgold, & Winzenz,
1969). Thus, the learner could first
recall the superordinate categories that are
the dimensions of the passage. After recalling
the dimension, he could recall the subordinate
categories that are the attribute values
of the dimensions. The fish names associated
with each value are at the lowest
level. Thus, in recalling the information, the
learner might recall the dimension color,
then the value orange, and the three names
associated with it; then blue and the three
names associated with it. Recall would then
move to the next dimension, depth, and its
values, etc. For the name-organized passage,
first the name of the fish must be
retrieved, followed by all the values associated
with each fish name. One of the main
purposes of this research is to present data
bearing on these alternative interpretations.
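The hierarchical retrieval scheme described here (dimension, then value, then fish names) can be pictured as a three-level tree walked from the top down. The sketch below is an illustration of that idea only, not a model the authors tested; the data come from Table 1 and the function names are invented.

```python
ROWS = [
    ("Hatfish",   "orange", "200 fathoms", "plankton"),
    ("Loopfish",  "orange", "200 fathoms", "spawn"),
    ("Arcfish",   "orange", "600 fathoms", "algae"),
    ("Bonefish",  "blue",   "400 fathoms", "plankton"),
    ("Pinfish",   "blue",   "400 fathoms", "algae"),
    ("Scalefish", "blue",   "600 fathoms", "spawn"),
]
DIMENSIONS = ("color", "depth", "diet")


def build_hierarchy(rows):
    """Nest the table as dimension -> attribute value -> list of fish names."""
    tree = {dim: {} for dim in DIMENSIONS}
    for name, *values in rows:
        for dim, value in zip(DIMENSIONS, values):
            tree[dim].setdefault(value, []).append(name)
    return tree


def recall_order(tree):
    """Walk the tree top-down: a dimension, then each value and its names."""
    for dim, values in tree.items():
        yield dim
        for value, names in values.items():
            yield value
            yield from names


order = list(recall_order(build_hierarchy(ROWS)))
```

Each retrieved node serves as the cue for the items beneath it, so the walk begins with color, orange, and the three orange fish, as in the example given in the text.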

Another variable of interest in this re- 
search is the effect of changing the passage 
organization on successive readings. How 
should this affect recall? Again, there are
several possible information-processing in-
terpretations. First, it might be argued that
if the organization of the passage induces a
retrieval scheme on the part of the learner, 
then changing the organization on succes- 
sive readings should retard learning because 
the learner must develop different retrieval 
schemes on each reading. This is indeed the 
result with word lists (Bower, Lesgold, & 
Tieman, 1969). 

Alternatively, it could be argued that
changing the organization of the material
on successive readings should facilitate re-
call since attention or mathemagenic re-
sponses to the text will be maintained at a
high level by the changed text (Rothkopf,
1965).

Another prediction in favor of superior 
learning induced by changed organization 
can be made from the notions of stimulus 
variability (Estes, 1959). Presenting the in- 
formation in two different contexts might 
lead to multiple copies of the associations in 
memory and thus greater availability for 
recall.

The effect of organization in memory in- 
duced by the structure in the reading mate- 
rial and the effects of changed structure on 
successive readings should certainly vary as
a function of the level of learning. In this
study, we assessed the effect of this variable 
by varying the time the reader is allowed to 
study the passage. 
METHOD
Design and Materials 

As outlined in Table 1, the reading passages
described six imaginary fish that differed along
three dimensions. In addition to the name-attribute 
associations, shown in Table 1, there were four 
other items of information about the fish. These 
were included in the passages to make them more 
natural, and to test the retention of extraneous 
material. The variables of interest in this study 
were (a) type of organization on first reading, (b) 
organization on second reading, (c) syntax of read- 
ing material, and (d) study time. 

Two different organizations were used: those 
that were name organized and those that were at- 
tribute organized. In the name-organized passage, 
all attributes of a given fish were described before 
the next fish was introduced. The name-organized 
passage read as follows: 


Large schools of rare fish have recently been dis- 
covered in Lake Sydney. Hatfish, which are 
orange, live at a depth of two hundred fathoms 
and feed primarily on plankton. Scalefish are 
blue and live at six hundred fathoms, feeding pri-
marily on spawn. The relative sizes of the fish
increase with depth. Bonefish are blue, live at 
four hundred fathoms, and feed primarily on 
plankton. Arcfish, which are orange, live at six 
hundred fathoms and feed primarily on algae. 
Occasionally a very hungry fish might attack a 
smaller fish. Loopfish are orange and live at a 
depth of two hundred fathoms, feeding primarily 
on spawn. Pinfish are blue, live at four hundred 
fathoms, and feed primarily on algae. Marine 
biologists are intrigued by eerie glows emitted by 
these fish. 


In the attribute-organized passage, all fish as- 
sociated with given attribute values in each di- 
mension were named before introducing the next 
dimension and its values. It read as follows: 


Large schools of rare fish have recently been dis- 
covered in Lake Sydney. Fish that live at a 
depth of two hundred fathoms are hatfish and 
loopfish. Bonefish and pinfish live at four hun- 
dred fathoms. The deepest fish are scalefish and 
arcfish, which live at a depth of six hundred 
fathoms. The relative sizes of the fish increase
with depth. Loopfish and scalefish feed primarily 
on spawn. Plankton is the primary food of hat- 
fish and bonefish. Arcfish and pinfish seem to 
prefer algae. Occasionally a very hungry fish
might attack a smaller fish. The color of loopfish, 
hatfish and arcfish is orange. Blue is the color of 
pinfish, scalefish, and bonefish. Marine biologists 
are intrigued by eerie glows emitted by these
fish.


The sentences used in the two passages are 
structurally different. This is a consequence of
making the passages read as naturally as possible.
To determine whether the effects of organization
depended on these differences in syntax, two other 
name- and attribute-organized passages were con- 
structed from a single set of sentences. These 
passages differed only in sentence order. For these 
passages, all of the sentences describing the fish 
were simple, declarative sentences. The subjects of 
these sentences were the names of the fish, and the 
predicates were the attribute descriptions. In all
other respects, including the placement of the ex- 
traneous information, these passages were orga- 
nized like the natural text. The name-organized 
passage constructed of simple sentences read as 
follows: 


Large schools of rare fish have recently been dis- 
covered in Lake Sydney. Hatfish are orange. 
Hatfish live at a depth of two hundred fathoms. 
Hatfish feed primarily on plankton. Scalefish are 
blue. Scalefish live at a depth of six hundred 
fathoms. Scalefish feed primarily on spawn. The 
relative sizes of the fish increase with depth. 
Bonefish are blue. Bonefish live at a depth of four 
hundred fathoms. Bonefish feed primarily on 
plankton. Arcfish are orange. Arcfish live at a
depth of six hundred fathoms. Arcfish feed pri- 
marily on algae. Occasionally a very hungry fish 
might attack a smaller fish. Loopfish are orange.
Loopfish live at a depth of two hundred fathoms. 
Loopfish feed primarily on spawn. Pinfish are 
blue. Pinfish live at a depth of four hundred 
fathoms. Pinfish feed primarily on algae. Marine 
biologists are intrigued by eerie glows emitted by 
the fish. 


Half of the subjects were run using the natural 
passages, and half of the subjects were run using 
passages consisting of simple sentences. 

Also varied was the amount of time subjects
were allowed to study the passages on each reading. 
Two different study times were used: two minutes 
and three minutes. Half of the subjects were run
under each study time condition. 

All subjects read the information twice. For half 
of them, the passage organization on the second
reading was the same as the first reading. For the 
other half of the subjects, the passage organiza-
tion was switched on the second reading.

The four variables were passage organization on
first reading (name or attribute organized); syntax
(natural or simple); study time (2 or 3 minutes);
and passage organization on the second reading
(same or switched). They were combined in a
2 × 2 × 2 × 2 factorial design with 16 independent
groups. There were 14 subjects in each group.
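
The crossing of the four two-level variables can be sketched as a short enumeration. This is only an illustration of the design arithmetic; the dictionary keys and level labels are names chosen here for readability, not terms taken from the study materials.

```python
from itertools import product

# Factor levels as described in the design; the key names are illustrative.
factors = {
    "first_reading": ("name", "attribute"),
    "syntax": ("natural", "simple"),
    "study_time_min": (2, 3),
    "second_reading": ("same", "switched"),
}

SUBJECTS_PER_GROUP = 14

# Every combination of one level per factor defines one independent group.
groups = [dict(zip(factors, levels)) for levels in product(*factors.values())]

n_groups = len(groups)
n_subjects = n_groups * SUBJECTS_PER_GROUP

print(n_groups)    # 16
print(n_subjects)  # 224
```

The total of 224 agrees with the number of subjects reported in the Subjects section.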

Free-recall tests, cued-recall tests, and recogni-
tion tests were given to provide measures of reten-
tion and organization.


Subjects 


The subjects were 224 introductory psychology 
students at UCLA. Students are required to serve 
as subjects in a certain number of experiments, but 
were allowed to select the experiments in which
they served. The sign-up sheets for this experiment
were entitled “Reading Comprehension Experiment.”


Procedure 


The instructions, reading, and test materials
were assembled into test booklets. The subjects 
were run in groups varying in size from 5 to 20, and 
were randomly assigned to experimental conditions 
except that only one of the study-time conditions 
was used at a given test session. Subjects were 
paced through the test booklet by the experi-
menter. Instructions stated that the experiment was
concerned with how people learn from reading. The 
subjects were told that they would have two read- 
ing and test periods and that they would be asked 
to write down the information they learned from 
the passage. They were told not to try to memorize 
the passage, word-for-word, but to try to remember 
the information in the passage. They were not al-
lowed to write during the study periods, were in-
structed not to turn pages until told to do so, and
were told not to look back to previous pages at any 
time. 

After the first study period, there was a 90-sec- 
ond period in which the subjects answered some 
forced-choice personality questions. This 90-second 
delay before recall was inserted to limit the effects 
of recency on recall order. The subjects were then 
allowed 5 minutes for the first recall test. They 
were instructed to write down everything they 
could recall from the reading passage, in any order 
they wished. They were told that they did not have 
to use complete sentences, but that they should try 
to state specific facts rather than generalities. The 
second study period of two or three minutes fol- 
lowed this recall. The subjects then worked on more 
forced-choice personality ratings for 90 seconds,
followed by the second 5-minute recall test period. 
They were then given a cued-recall test which con- 
sisted of a table with the columns labeled (as in 
Table 1) with name of fish, depth, color, and diet, 
and were asked to fill in the six rows. Three min-
utes were allowed for this. Finally, subjects were
given a recognition test that consisted of 42 true- 
false and multiple-choice items. 


Scoring 


The number of correct associations and the num- 
ber of items recalled correctly provided retention 
scores for the free-recall and cued-recall (table- 
completion) tests. Free-recall protocols were scored 
as follows: Any statement, in order to contain in- 
formation relevant to the passage content, must 
assert (a) a relationship between a name and an at- 
tribute value; or (b) a statement equivalent to one 
of the four extraneous sentences in the text. In de- 
termining clustering indexes, the latter statements 
were ignored. The assertions were listed sequen-
tially in the order in which the subject had written
them, and coded according to the relationship ex-
pressed (regardless of the correctness of the as-
sertion made). The organization or clustering
indexes were then computed as described by Frase 
(1969). To calculate the index for attribute cluster- 
ing, the three attribute columns of Table 1 were 
coded by the digits 1, 2, and 3. The sentences of the 
passages, which described a relationship between 
names and attributes, could be coded by sequences 
of these digits. When this was done for the entire 
passage, an 18-digit number representing the se- 
quences of attribute descriptions in each passage 
was obtained. The amount of attribute clustering
was determined by counting the number of times 
a digit was repeated consecutively (R) and divided 
by the total number of attribute sentences (T), 
less the number of categories used (K). K is sub- 
tracted from T since the first attribute description 
about each name cannot be a repetition. In the 
name-organized passage, T = 22, K = 3, and R =
0, so that the attribute cluster index equals
(R/(T − K)) × 100 = 0%. For the attribute-organized
passage, the attribute clustering index is 100%. A 
similar index can be computed for name clustering. 
The percentage of organization by name is 100% 
for Passage 1 and 0% for Passage 2. The recogni- 
tion test was scored by giving one point for each 
correct response.
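
The clustering index described above, (R/(T − K)) × 100, can be sketched in a few lines. The function names and the two example codings are illustrative assumptions: the sequences assume 18 coded name-attribute assertions (as the 18-digit coding implies) and an arbitrary assignment of the digits 1, 2, and 3 to the three attribute columns.

```python
def repetitions(codes):
    """Count how many times a code immediately repeats its predecessor (R)."""
    return sum(1 for prev, cur in zip(codes, codes[1:]) if prev == cur)


def cluster_index(codes):
    """Return (R / (T - K)) * 100 for a sequence of category codes."""
    t = len(codes)        # T: total number of coded assertions
    k = len(set(codes))   # K: number of categories actually used
    if t == k:            # no repetition is possible; assumed to score 0
        return 0.0
    return 100.0 * repetitions(codes) / (t - k)


# Hypothetical codings of 18 assertions (digit-to-column mapping is assumed).
name_organized = [2, 1, 3] * 6                       # attributes rotate within each fish
attribute_organized = [1] * 6 + [3] * 6 + [2] * 6    # one dimension at a time

print(cluster_index(name_organized))        # 0.0
print(cluster_index(attribute_organized))   # 100.0
```

The same function gives a name-clustering index when the sequence codes fish names instead of attribute columns.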


RESULTS 


Total Associations Recalled on 
Free-Recall Tests 


An analysis of variance of these scores 
showed significant effects at the .05 level 
or better for the organization of first and 
second readings, study time, and recall-trial 
number. However, the effects of the syntax 
variable, whether the text was natural or 
consisted of simple sentences, were minimal 
and variable and did not approach the .05 
level of significance; neither did any inter- 
actions involving the syntax variable ap-
proach the .05 level.

Figure 1 shows the mean correct associa- 
tions recalled on the first and second tests 
as a function of organization on the first 
and second recall trials. The results are col- 
lapsed over the syntax variable. The left 
and right panels show results for the 2-min- 
ute and 3-minute study-time conditions, re-
spectively.

As expected, the three-minute conditions 
generally achieved a higher level of recall. 
For both study-time conditions, the AA 
condition, which had the attribute-organized
passages on both readings, achieved the highest
performance level. Individual comparisons on
the second recall, using Tukey’s tests run at
the .05 level, showed that for the 2-minute
condition, Group AA's recall was reliably
different from the other conditions, which did
not differ significantly from one another. For
the 3-minute conditions, recall in the AA
condition was significantly superior to the NN
condition. These results for the 3-minute
condition were a close replication of the pilot
study for this experiment, which used the same
natural text as in this study and similar
experimental conditions.

FIG. 1. Mean number of correct associations recalled on the free-recall tests. (Conditions
are labeled according to their passage organization [A = attribute, N = name] on their first
and second readings.) [Panels: 2 min study time and 3 min study time. Vertical axis: number
of correct associations recalled. Horizontal axis: recall test.]

Recall of the four extraneous items alone 
was analyzed in a separate analysis of vari- 
ance. There were no statistically reliable 
differences (p > .05) among the various ex- 
perimental conditions. 


Organization of Free Recall 


Using the scoring method described ear-
lier, name- and attribute-clustering indexes
were computed for each protocol. There was 
a very high negative correlation between 
the name and attribute scores, and most 
scores were close to 0% or 100%. Only 54 of 
the 448 pairs of scores had one member that
was neither 0% nor 100%. Therefore, each
protocol was classified as name- or attri- 
bute-organized according to which index 
had the highest value. The number of name- 
and attribute-organized protocols for each 
condition are shown in Table 2. Passage or- 
ganization has a large effect on subjects’ 
organizational output, both on the first and 
second recall (p(χ²) < .01 for both first and
second recall). The recall organization gen- 
erally followed the organization of the pas- 
sage that was just read, with the attri- 
bute-organized passage showing larger 
effects than the name-organized passage. 
The effect of passage organization was 
greater on the second reading than on the 
first. 
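
The classification rule used for Table 2 (each protocol labeled by whichever clustering index is higher) can be sketched as follows. The index pairs are made-up illustrations, not data from the study, and the handling of ties is an assumption, since the text does not say how equal indexes were treated.

```python
from collections import Counter


def classify_protocol(name_index, attribute_index):
    """Label a recall protocol by whichever clustering index is higher."""
    if name_index > attribute_index:
        return "N"
    if attribute_index > name_index:
        return "A"
    return "tie"  # assumption: the source does not describe tie handling


# Hypothetical (name index, attribute index) pairs, in percent.
protocols = [(100.0, 0.0), (0.0, 100.0), (20.0, 80.0), (100.0, 0.0)]

counts = Counter(classify_protocol(n, a) for n, a in protocols)
print(counts["N"], counts["A"])
```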


Cued Recall 

The cued-recall test required subjects to 
fill in the name-attribute associations in a 
blank table with the columns labeled as in
Table 1. An analysis of variance of total
correct responses yielded significant effects
(p < .05) for passage organization on the
first and second readings. The effect of
study time was not significant, but there
was a significant interaction (p < .01) be-
tween study time and second reading orga-
nization. As in the analysis of free-recall
results, neither the syntax variable nor its
interactions with the other variables ap-
proached significance (p > .05). The mean
correct responses on the cued-recall test are
shown in Table 3. Comparisons of the
groups within each study-time condition
using Tukey’s test indicated that the AA
group was significantly superior to the NN
and AN groups (p < .05) in the 2-minute
and 3-minute conditions.

TABLE 2
Frequency of Name- (N) and Attribute- (A) Organized Free-Recall Protocols
[Column headings: Condition; First test and Second test under 2-minute study time;
First test and Second test under 3-minute study time.]


Recognition Tests 


The differences among recognition scores 
were not great, and an analysis of variance 
yielded no significant effects (p > .05). 


DISCUSSION


Differences in organization of the text 
produced sizable differences in recall. 
Groups who studied attribute-organized 
passages on both readings had significantly 
higher recall scores in both study-time con- 
ditions. Since the syntax variable yielded 
no reliable effects, these differences can rea- 
sonably be attributed to differences in the 
organization of the information in the two 
passages. It can also be seen in Figure 1 
that the effects of organization were greater 
on the second recall. Frase (1969) has re- 
ported a similar effect of organization on 
prose learning. 

The effects of variations in study time 
are similar to the differences obtained be- 
tween the two recall trials. Thus, increasing 
the study time led to a higher level of re-
call, and increased differences in level of
recall in favor of the groups having the at- 
tribute-organized passages on both read- 
ings. 

Passage organization also had a strong 
effect on the organization of recall with the 
recall organization generally following the 
last read passage. 

The interpretation of these results in in- 
formation processing terms is that the at- 
tribute-organized passage produced better 
recall because it suggested a more efficient
retrieval strategy in terms of hierarchically
organized categories of information. That
changing the organization produced a lower 
level of recall again points up the impor-
tance of retrieval strategies in recall of in-
formation from text. Changing the organi-
zation of the text from attribute organiza-
tion produced poorer recall because subjects
were learning a new retrieval strategy for 
the changed organization. 

No reliable differences in forced-choice 
recognition scores were obtained. This may 
indicate that our test was not very sensi- 
tive. However, our result that organization 
has little effect on recognition scores is also 
consistent with much recent work in verbal 
learning (e.g., Estes & DaPolito, 1967)
which suggests a qualitative difference be- 
tween recall and recognition processes. The 
main difference is that retrieval processes 
are more important in recall than in recog- 
nition. Kintsch (1970) summarizes some of 
this work on word lists by stating that 


The single memory trace appears to be the appro- 
priate unit of analysis in the case of recognition 
memory, while recall is determined by interrela- 
tionships among items... . [p. 279]. 


TABLE 3
Mean Correct Responses on the Cued-Recall Test

Passage          Study time
order       2 minutes   3 minutes
AA            19.3        20.6
AN            16.5        17.4
NA            16.1        18.1
NN            16.2        15.0

There is an important point we wish to
make about our results and the application 
of information-processing approaches to 
memory. We do not interpret our results as 
meaning that attribute organization is al- 
ways superior to name organization. 
Rather, in determining the effectiveness of a 
particular organization, it is important to 
look very carefully at the information- 
processing demands of the task, and how 
these interact with the subject’s informa- 
tion-processing capabilities. Thus, many 
workers have made the distinction between 
the control processes of memory and the 
structural components of memory (e.g.,
Friedman, Trabasso, & Mosberg, 1967; but 
most clearly by Atkinson & Shiffrin, 1968). 
The control processes can be varied to meet 
the demands of the task within the struc- 
tural information processing limitations of 
the memory system. To make this point 
clear, it will be helpful to compare our re- 
sults with those obtained by Frase (1969), 
who also compared attribute- and name-or- 
ganized passages. Frase found no differences 
between attribute and name text organiza- 
tion, though both were superior to a random 
organization. Frase also found a greater 
tendency towards name organization in the 
subjects’ recall protocols. These results are 
to be contrasted with the present results 
which indicate superiority of attribute or- 
ganization in recall and a greater tendency 
for subjects to use attribute organization. 

However, there are important differences 
between the materials used in the two stud- 
ies that lead to different information-proc- 
essing requirements. Frase’s text describes 
the characteristics of the six chess pieces in 
terms of eight dimensions—their moves, 
point values, etc. Thus, although there were
six names used in both studies, Frase used
eight dimensions as compared with our 
three dimensions. This leads to different in- 
formation-processing requirements in the 
two tasks. In Frase’s name-organized ver- 
sion, the name of each chesspiece was re- 
peated eight times in eight successive sen- 
tences, and there was less change from sen- 
tence to sentence. Sequences of sentences in 
the attribute-organized passages had as 
many as five different names and attribute 
values in succession. The difference be-
tween attribute- and name-organized pas-
sages is much less striking in the present
study. 

The difference in the number of
dimensions in the two studies leads to other
differences in the information processing re- 
quirements. On the basis of his work on 
organization and memory for lists of words, 
Mandler (1967) has suggested that the 
basic limit of the organizing system for 
memory (which might be considered as a 
structural feature of memory in the frame-
work of Atkinson and Shiffrin, 1968) is 5 ±
2 categories at each level of the hierarchical
structure. Since eight dimensions were used 
in Frase’s study, it may be that subjects 
could not use the attribute organization of 
the passage to develop an efficient retrieval 
scheme. 

A third difference between the two studies 
in terms of information-processing require- 
ments is in the saliency or familiarity of the 
dimensions used in the two studies. In the 
present study, the dimensions of color and 
depth and diet are both familiar and rea- 
sonably descriptive for fish. In Frase’s 
study, the dimensions of piece (whether or 
not it is a pawn), point value, moves, cap- 
tures, etc. are not very familiar categories 
to persons unversed in chess, and thus may 
not provide good retrieval cues. In fact, the 
names of the pieces (pawn, king, queen, 
bishop, etc.) are probably more familiar 
and salient to the average reader, and thus 
may provide a better organizing principle 
than the unfamiliar attributes of chess 
pieces.

A fourth difference between the two stud- 
ies lies in the actual values of the dimen- 
sions used in the two studies. To provide 
good retrieval cues the values of the dimen- 
sion should be reasonably natural. That 
was certainly the case in the present study, 
but not so for all of Frase's dimensions. 

The point of these examples is simply to 
demonstrate that it is important to consider 
how the information-processing require- 
ments of a task lead the subject to organize 
his learning and retrieval strategies. These
examples also point up some interesting
variables that might be manipulated in fur-
ther studies of learning from reading.

In a replication and extension of work by Silberman, differential teacher
behavior toward different students was studied in relation to the
attitudes teachers held toward those students. Using data on dyadic
teacher-child interactions collected with the Brophy-Good system,
contrasting patterns were noted in the ways teachers interacted with
students toward whom they felt attachment, concern, indifference,
or rejection. Four distinct patterns were observed. The data generally
confirmed Silberman’s findings, even though the present study was done
at a different grade level and involved three different types of student
populations. Methodological differences that may explain the
discrepancies which did occur are discussed, along with suggestions for
related research.


Jackson, Silberman, and Wolfson (1969) 
have empirically demonstrated that teach- 
ers feel differently about different children 
in their classrooms. Silberman (1969) has 
shown such differential teacher attitudes to 
be associated with differential teacher be- 
havior. Using a sample of 10 female third- 
grade teachers, who had taught in upper- 
middle-class suburban school systems for at 
least 3 years, he obtained responses to the 
following interview items: 

1. Attachment: If you could keep one
   student another year for the sheer joy
   of it, whom would you pick?
2. Concern: If you could devote all your
   attention to a child who concerns you
   a great deal, whom would you pick?
3. Indifference: If a parent were to drop
   in unannounced for a conference,
   whose child would you be least pre-
   pared to talk about?
4. Rejection: If your class was to be re-
   duced by one child, whom would you
   be relieved to have removed?

Following these interviews, 20 hours of
observation data were collected in each
class to see how teachers treated the stu-
dents they nominated, and to see what the
students were like. Profiles of the character-
istics of the four types of students and of
the teachers’ behavior toward them are pre-
sented in the following paragraphs.


Attachment 


Children in this group were seen as con- 
forming, fulfilling the personal needs of the 
teachers (volunteering, answering questions 
correctly), and making few demands on 
their energies. Even though the teachers 
preferred these students, they did not inter- 
act with them or call on them more fre- 
quently than the others. However, the 
teachers did provide more praise to these 
students and held them up as models to 


their classmates. 


Concern 

Children in this group made extensive but 
appropriate demands upon the teacher's 
time. Of the groups studied, these children 
received the most teacher attention. Teach- 
ers initiated frequent contact and placed 
few restrictions on these children, who were 
allowed to approach them freely in most 
circumstances. Teachers praised their work
frequently and were careful to reward ef- 
fort. However, at times, the teachers did 
express their concern directly and openly:
“I don’t know what to do with you next.”


Indifference 

These children were seldom noticed by 
the teachers and had much less contact with 
them than other children. Other than infre- 
quency and brevity of contacts, no differ- 
ences in teacher behavior toward these 
children were observed. 


Rejection 


The teachers viewed these children as 
making illegitimate or overwhelming de- 
mands upon them. In contrast to the con- 
cern students, these children often received 
criticism when they approached the 
teacher; if concern students could do no 
wrong, rejection students could do nothing 
right. These children were under continual 
surveillance, and much teacher behavior di- 
rected at them involved attempts to control 
their behavior. However, the teachers had 
frequent contact with these children and 
frequently both praised and criticized their 
behavior in public. Interestingly, 8 of the 10 
rejection students were asked to leave the 
room at least once when an observer was 
present. 

Attitudes toward individual students sig- 
nificantly affected the teachers’ behavior, 
although there were differences within the 
attitudes sampled. Teacher concern and in- 
difference were more readily expressed than 
rejection and attachment. Silberman sug- 
gests that the teacher role may interact with 
teacher preferences to prevent the expres- 
sion of rejection and attachment. Feelings 
of indifference and concern present less role 
conflict, and therefore are easier attitudes 
for the teacher to express in the classroom. 

Silberman’s study is an important addi- 
tion to the growing literature on intraclass- 
room differences in teacher-child interaction 
patterns (reviewed in Good & Brophy, 
1971). His results have important implica- 
tions for teacher education and training. 
However, certain aspects of his design suggest
improvements. First, Silberman's attitude data
were collected before observational measures
were taken, so that knowledge of the relevant
variables might have led teachers to distort
their behavior during observation periods (mask
favoritism toward preferred children, demonstrate
concern for children described as objects of
special concern, etc.). Second, Silberman
used only one student in each classroom to
represent an attitude group. Teachers may
be attached to or concerned about a student
for a variety of reasons, and may show
their attitudes through various behaviors.
Third, Silberman’s teachers all worked in
upper-middle-class suburban schools. It is
possible that expression of teacher attitudes
may vary with school or learner characteristics.
For example, if much more negative
affect is expressed by students in lower-class
schools, it may be easier for teachers
to express rejection there than in middle-class
schools.

These considerations were incorporated 
into the present study, which was a replica- 
tion and extension of Silberman's work. 


METHOD


Data Collection 


Data were collected in nine first-grade class- 
rooms that were already involved in a larger study 
of the relationships between teachers’ performance 
expectations and their behavior toward different 
children (Brophy & Good, 1973, in press). There 
were three classes studied in each of three types 
of schools: upper-middle-class white, lower-class 
white, and lower-class black. Teachers were told 
that the investigators were interested in observing 
differences in the classroom behavior of children 
who varied in achievement. In late September, 
each teacher supplied a list that ranked her chil- 
dren, in order, according to the levels of achieve- 
ment she expected. Other than this achievement 
rank and a seating chart, no information was re- 
quested from the teachers until all behavioral ob- 
servation data were collected. This eliminated the 
possibility that knowledge of the relevant attitude 
variables could influence the behavioral data.

Sixteen 2½-hour observations were made in
each classroom with the Brophy-Good dyadic
interaction observation system (Brophy & Good,
1970; Good & Brophy, 1970). This system yields a
variety of measures of the quality and quantity of
teacher-child interaction, separately recorded for
each child in the class. The resulting data pool
provided information on the teacher-child
interaction patterns of 270 children, based on 40 hours
of classroom observation taken on 16 different days
during a 3-month period.

During September, pairs of observers worked in 
each classroom to establish reliability and to de-
sensitize the teachers and children to their presence.
After reliability was established (procedures are
detailed in Brophy & Good, 1970), the observers 
began to work singly, making observations in 
October, November, and early December. Ob- 
servers had not seen the teachers’ achievement 
rankings, and did not know that attitude data 
would be collected. 

In December, after all classroom observations 
had been completed, teacher attitude data were col- 
lected through a mailed questionnaire. The instruc-
tions were:


When you answer these questions, please have
the class roll in front of you so that the names
of all the children are before you. Please try to
name at least three children for each question.
Children can be named for more than one ques-
tion.

The four questions were the same as those used 
by Silberman, except that “If your class was to be 
reduced by a few children, which would you have 
removed?" was substituted for “...whom would 
you be relieved to have removed?" for the re- 
jection item. This was done at the request of the 
school district administration. All nine of the 
teachers responded as requested. 


Data Analysis 


The raw tabulations were first converted into
measures designed to eliminate distortion due to
absences and to allow direct comparison among
children in the same room. Frequency counts were
converted to means, dividing each child's totals by
the number of observations for which he was
present. Other measures were percentage scores
compiled according to the procedures detailed in
Brophy & Good (1970).

The data for each class were then standardized
(M = 0, SD = 1) to set the nine classes on a common
scale and eliminate variance due to teacher
or class differences. Two sets of analyses of variance
were then obtained from these standardized score
distributions. First, a series of one-way analyses of
variance were performed in which the scores of
each attitude group were compared with the scores
of all other children. Thus, these analyses compared
the scores of 25-30 children with those of the
remaining 240-245. These results are summarized
below.
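
The two conversion steps described above (per-child means over the sessions attended, then standardization within each class) can be sketched as follows. This is a minimal illustration with made-up numbers; the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, pstdev


def per_observation_mean(total, sessions_present):
    """Divide a child's frequency total by the sessions the child attended."""
    return total / sessions_present


def standardize(scores):
    """Rescale one class's scores to mean 0 and SD 1 (population SD assumed)."""
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        return [0.0 for _ in scores]  # all scores identical; nothing to scale
    return [(s - m) / sd for s in scores]


# Hypothetical class: (frequency total, sessions present) for four children.
raw = [(24, 16), (30, 15), (8, 16), (42, 14)]
class_means = [per_observation_mean(total, present) for total, present in raw]
z_scores = standardize(class_means)

print([round(z, 2) for z in z_scores])
```

Standardizing each class separately puts all nine classes on a common scale before the attitude groups are compared with the remaining children.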
The data were also subjected to a series of two-way
analyses of variance, in which school as well as
attitude was used as a classifying variable. Since
Silberman’s data came exclusively from upper-middle-class
schools, and since the data in this research
came from three quite different schools, the
possibility that school effects would interact with
attitude effects was investigated. These analyses
produced some significant interactions. However,
the number obtained was no greater than that
expected by chance. In addition, no observable
pattern was found, and no reversals of main effects
occurred in individual schools. It was concluded that
the attitude effects were similar across the three
schools, despite their contrasting populations.

TABLE 1
Sex and Achievement Status of Children in the Four Attitude Groups

Status                Attachment  Indifference  Concern  Rejection
Sex
  % Boys                  44           58          46        68
Achievement rank
  % in top third          75            0          11         8
  % in middle third       21           50          14        29
  % in bottom third        4           50          76        63


RESULTS*


Data on the sex and achievement status 
of the children in the four attitude groups 
are given in Table 1. Roughly equal num- 
bers of boys and girls appear in the attach- 
ment, indifference, and concern groups, but 
the teachers nominated twice as many boys 
as girls to the rejection group. Achievement 
status was related to all four attitudes. The 
attachment group was composed mostly of 
high-achieving students, while the other 
three groups were mostly low and average 
achievers. It appears that teachers get to 
know and like high achievers. Children in 
the middle range of achievement appear 
less salient to the teachers; they were men- 
tioned frequently only on the indifference 
item. Low achievers appear mostly as ob- 
jects of teacher concern (especially if they 
are girls) or rejection (especially if they are 
boys). 

* Due to space limitations, an extended table
summarizing the analysis of variance data could
not be included in this article. This table is in the
more extended report that will be sent upon re-
quest. The findings reported in this article were
significant at the .05 level or better, except for a
few (labeled in the text as “trends”) where the
level of significance was between .05 and .10.

Results from analyses of variance com-
paring each of the four attitude groups (re-
spectively) to all other children on the
teacher-child interaction variables are pre-
sented below. First, however, a few terms
may require explanation (see Brophy &
Good, 1970, for a detailed presentation of
the entire system). The terms "process," 
"product," and "choice" refer to types of 
teacher questions. Process questions require 
an explanation of a complex phenomenon 
or of the thinking or problem-solving strat- 
egies used in arriving at a conclusion. Prod- 
uct questions require a single-word or short 
answer, primarily reporting facts from 
memory. Choice questions merely require 
the child to select from among alternatives 
provided by the teacher (yes-no and ei- 
ther-or questions are included here). Gener- 
ally, process questions are more demanding 
than product questions, and both of these 
are more demanding than choice questions. 
A fourth term, self-reference, refers to ques- 
tions about personal or procedural matters 
rather than curriculum matters. 

The terms “open” and “direct” concern 
the way teachers select children to respond 
to questions. An open question is coded 
when the teacher calls on a student with his 
hand up who actively wishes to respond. A 
direct question is coded if the teacher 
names the respondent without waiting for a 
show of hands, or if he calls on a student 
who does not have his hand raised. “Call 
outs” are coded when the respondent calls 
out the answer without waiting for teacher 
recognition. 
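
The three selection codes defined above can be restated as a small decision rule. The Python sketch below is only an illustration of the definitions as worded in this article; the function and parameter names are hypothetical and are not part of the Brophy-Good coding system itself.

```python
def code_selection(teacher_recognized, waited_for_hands, hand_raised):
    """Return 'open', 'direct', or 'call out' for one response opportunity.

    teacher_recognized: the teacher selected the child before he answered.
    waited_for_hands:   the teacher waited for a show of hands before selecting.
    hand_raised:        the child had his hand up when selected.
    """
    if not teacher_recognized:
        # The child answered without waiting for teacher recognition.
        return "call out"
    if waited_for_hands and hand_raised:
        # The teacher chose a volunteer.
        return "open"
    # The teacher named the child before hands went up, or chose a non-volunteer.
    return "direct"


# Hypothetical events, one per code.
examples = [
    code_selection(True, True, True),
    code_selection(True, False, False),
    code_selection(False, False, False),
]
```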

The terms process and product are also 
used in coding teacher feedback to children. 
Process feedback is coded when the teacher 
reviews or explains the steps involved in 
reaching the correct solution or response. 
Product feedback is coded when he gives 
the correct answer, but does not explain the 
process. 

When a child gives a wrong answer or 
fails to respond, the teacher is coded for 
whether or not he “stays with” the child 
and provides a second response opportunity.
He can do this either by repeating the ques-
tion or by giving help (rephrasing or giving
a clue). In contrast to staying with the
child, the teacher may end the interaction
by giving the answer or calling on someone
else. 


Attachment 


Attachment students possess certain
qualities that may endear them to teachers.
These students actively seek out the
teacher, and they typically initiate contacts
about work assignments rather than merely pro-
cedural matters. Although attachment stu-
dents are active in the classroom, they do
not call out answers significantly more
often than other students. It appears that 
teachers like students who are bright and 
active, but able to control their intellectual 
curiosity and avoid violating classroom 
norms by calling out answers. 

In comparison with their classmates, 
these students provide substantially more 
correct answers per response opportunity, 
and they make fewer reading errors per 
reading turn. They also give more right an-
swers in reading-group question and answer
periods. In addition, when these students 
don’t know an answer they are more likely 
to try to respond rather than to make no 
response. More evidence that attachment 
students conform to institutional norms can 
be seen in their ratio of behavioral contacts 
to work-related contacts. These students 
had many fewer contacts with the teachers 
over behavioral issues than did their class- 
mates. 

It is easy to see why these children would 
be appealing to many teachers. They ap- 
pear to be bright, hard-working, no-non- 
sense students. How, then, did their teach- 
ers respond to them? Did they treat them 
differently? Apparently the teachers did not 
treat these children in grossly favorable 
ways. Although there were a number of 
measures that show differences, many of 
these are attributable to child behavior, not 
teacher behavior. For example, attachment 
students receive much more total praise for 
their academic work. They also receive less 
criticism and more praise in teacher-initi- 
ated work contacts. However, they are not 
praised significantly more often per correct 
answer than their classmates. Thus, the 
higher total praise given to these students 
may simply reflect the fact that they do 
perform more capably in the classroom. 

There is some evidence that teachers try 
to minimize their contacts with attachment 
students. They show trends toward seeking 
them out less often to discuss their work, 
and toward calling on them directly less 
often. However, the teachers show that in 
certain ways they do favor the attachment
students. These students receive more read-
ing turns and a greater percentage of proc-
ess questions. They also receive less process
feedback, apparently because the teachers 
feel that they understand the work and 
don't need it. 


Concern 


Although not as active as the attachment 
students, the concern students show a trend 
toward initiating more contacts with their 
teachers than most of their classmates do. 
However, their scores on performance qual- 
ity indicators are much lower. They provide 
fewer correct answers per response opportu- 
nity than other students, and make more 
errors per reading turn. When they don’t 
know the correct answer they are more 
likely to take a guess than to remain silent. 

The data clearly show that concern stu- 
dents receive different teacher treatment. 
They receive more opportunities to answer 
questions, both in general class activities 
and in reading groups. Teachers also seek 
out concern children for more private con- 
tacts, both procedural and work related. 

In addition to seeking out these children 
more often, the teachers respond to their 
failures more favorably than they respond 
to the failures of other students. For exam- 
ple, these students received a greater pro- 
portion of process feedback in teacher-initi- 
ated work contacts, indicating remediation 
efforts by the teachers. Also, the teachers 
are more likely to stay with these students 
when they commit reading errors, and they 
show trends toward more frequently asking 
them new questions in the reading group 
after they answer initial questions correctly 
and toward less frequently failing to give 
feedback after their answers. In addition, 
when these students fail to answer reading 
group questions correctly, the teachers are 
more likely to repeat the question than to 
give help. In sum, the teachers were care- 
fully monitoring the performance of con- 
cern students during reading groups, and 
were pushing them to do their best. The
trends seen in the reading group data for 
concern students also appear in the data 
from general class activities, but they are 
weaker and usually not statistically signifi-
cant.

Although the teachers seek out concern 
students for more contact and stay with 
them longer following both success and fail- 
ure, they do not praise or criticize these 
students significantly more than their class- 
mates. There is a trend (with some rever- 
sals) toward more frequent praise per suc- 
cess and less frequent criticism per failure, 
but none of the differences reach statistical 
significance. In general, then, the teachers’ 
treatment of concern students reflects con- 
cern with their learning progress (not their 
behavior). This concern is seen in evidence 
of attempts to get the most out of these 
students during discussion and recitation 
and to remediate their deficiencies during 
individual contacts with them. 


Indifference 


The indifference students as a group are 
quite passive in the classroom. They initi- 
ate fewer work and procedure contacts with 
their teachers, and they seldom call out re- 
sponses in general class activities or reading 
groups. When they do not know an answer, 
they are more likely to remain silent than 
to offer a guess. These students respond ad- 
equately when they do answer a question, 
being correct about as frequently as the rest 
of the students. They also seem to be about 
average in frequency of discipline contacts. 
Thus, passivity is the primary observable 
trait shown by these children. 

There are some observable teacher differ- 
ences in interactions with the indifference 
group. These students receive fewer re- 
sponse opportunities than their classmates, 
but this is due to their failure to seek re- 
sponse opportunities rather than to teacher 
discrimination. The teachers ask these stu- 
dents direct questions as often as they ask 
other students. 

However, the teachers initiate individual 
contacts less frequently with this group. 
Their tendency to avoid these children is 
not as great as the children’s tendency to 
avoid them, but it is observable in the data. 
This is especially true for procedural con- 
tacts, although a slight trend exists in work 
contacts also. There is a trend for these 
children to be selected to run errands or 
perform classroom management and
maintenance tasks less often. 

The teachers behave favorably toward 
these students when they do have individ- 
ual work contacts with them. They provide 
high rates of process feedback and low rates 
of criticism, suggesting a low-affect-high- 
problem-solving approach. Low affect is 
also seen in the data for total praise and 
criticism of academic performance. The in- 
difference students are lower on both of 
these measures than their classmates. 

In summary, students in the indifference 
group are generally passive and tend to 
avoid contact with the teachers, who in turn 
respond to them in much the same way. 
There is no evidence of teacher attempts to 
go after these students in compensation for 
their tendency to avoid contact. The teach- 
ers respond appropriately (although with 
little affect) when they do contact these 
students, but they show no particular con- 
cern about them. In many ways their treat- 
ment of these students is in sharp contrast 
to their treatment of the concern group, un- 
derscoring the accuracy of the teachers’ 
perceptions of their feelings about both 
groups. 

Rejection 


These children are very active in the 
classroom. They create many more proce- 
dure and work contacts with the teachers, 
and they call out a lot of answers in reading
group (but not in general class activity).
They are similar to their classmates in rates
of reading errors and percentage of ques-
tions answered correctly in general class ac-
tivities. There is a trend for them to answer
reading-group questions incorrectly a
higher percentage of the time, however. 

In addition to being active in academic
situations, these high-saliency students
have an extreme number of behavioral con- 
tacts with their teachers. Thus, these child- 
ren are placing frequent demands upon the 
teachers. How, then, do the teachers react? 

To begin with, these children have many 
fewer public response opportunities than 
their classmates. However, they call out 
more answers than the others, and the
teachers ask them just as many direct ques-
tions. Thus, the difference in total response
opportunities is due to the low frequencies
of open questions being answered by these 
children. This could be attributable either 
to the children (they don’t raise their 
hands) or to the teachers (they don’t call 
on these children when they volunteer). Un- 
fortunately, the data do not tell us which 
situation is the true description. 

There is some evidence from other meas- 
ures that the teachers tend to avoid rejec- 
tion children in public situations. For exam- 
ple, these children receive fewer reading 
turns. Furthermore, the teachers frequently 
fail to give feedback to these students after 
their reading turns and after they respond 
to questions, suggesting that the teachers 
may want to move on quickly to someone 
else. 

The teachers do initiate more individual 
contacts with rejection students. Perhaps 
they prefer to deal with them in private 
situations when possible. However, rejection 
children are more likely than their class- 
mates to be criticized when they seek out 
the teachers for private work contacts, and 
they are generally criticized more for their 
classroom behavior and work. 

Thus, several measures show teachers to 
be rejecting and avoiding this group. 


Discussion 


Although the data in this study were 
drawn from first-grade classrooms and from 
schools representing three distinct socioeco- 
nomic levels, they parallel the data ob- 
tained by Silberman (1969) in most as- 
pects. Particularly, teacher treatment of the 
indifference and the concern students was 
quite similar in both studies. The data sup-
port Silberman’s conclusion that teachers’
attitudes toward children do correlate with
differential teacher behavior; however, the 
present data also suggest that all four 
teacher attitudes lead to differential teacher 
behavior. Silberman reported differential 
teacher behavior toward concern and indif- 
ference students, but found little evidence 
that teachers differentially treated students 
they felt attached to or that they rejected.

Teachers in this study did interact in dis- 
tinct ways with their attachment students. 
Although there was no gross favoritism, 
these teachers provided attachment stu-
dents with additional support in subtle
ways.

The findings for the concern students
parallel Silberman’s, and the data
for indifference students confirm, but extend 
somewhat, his conclusions. Specifically, 
both studies found that indifference stu- 
dents do not approach the teacher, nor does 
the teacher approach them. However, it was 
noted in the present study that these child- 
ren were seldom praised or criticized in aca- 
demic work situations, even though their 
performance was similar to that of other students.
Thus, these children have little contact with 
the teacher, and when they do have contact, 
it seldom results in strong evaluative com- 
ment. 

The findings for the rejected students dif- 
fer somewhat from the data reported by 
Silberman. He reported that teachers had 
similar contact frequencies with rejected 
students as with others, but that they both 
praised and criticized them more fre- 
quently. However, in this study the teach- 
ers avoided public contacts with these 
children. Also, they often failed to provide 
these students with feedback about their 
work, and when they did provide feedback, 
it was much more likely to involve criticism 
than feedback given to other children. Per- 
haps this discrepancy between findings is 
due to the fact that Silberman's behavioral 
data were collected after attitude informa- 
tion was obtained from the teachers. Also, 
this research distinguished among work, 
procedure, and behavior contacts, while Sil- 
berman lumped them together. In any case, 
it is clear that teachers in this study re- 
jected and avoided rejection students. 

The two-way analyses of variance indi- 
cate that the school environment has little 
effect upon how the different attitude 
groups are treated in the classroom. To the 
extent that teachers do differentiate in their 
behavior toward attachment, concern, indif- 
ference, and rejection children, similar re- 
sults will occur in dissimilar schools. The 
number of schools and teachers studied here 
was small, but the extensive observational 
data taken in the classrooms argue strongly 
that the obtained differences in this sample 
do characterize the real behavior of these
teachers. In combination with Silberman’s,
the data suggest that the attitudes teachers 
hold toward students do influence the ways 
in which they interact with those students. 
These data show, as Jackson and Lahad- 
erne (1967) have previously reported, that 
classroom life is an uneven affair, with some 
students receiving much more teacher con- 
tact than others. Teachers’ attitudes toward 
students will affect the quality and quan- 
tity of contacts they have with students. 
More studies in this area are needed, par- 
ticularly at the secondary level, to achieve 
clearer understanding of how teacher atti- 
tudes structure the teacher-child interac- 
tion. 

Teacher attitudes can change, of course, 
especially in response to disconfirming stu- 
dent behavior. Studies of student attributes 
that influence the formation and change of 
teacher attitudes are also needed to comple- 
ment the present line of research. Fesh- 
bach’s (1969) work, for example, showed 
that student teachers prefer conforming and 
passive students. To date, the behavioral 
characteristics of concern and rejection stu- 
dents are largely unexplored. These groups 
appeared similar in the present study, yet 
certain unknown characteristics caused the 
teachers to become concerned about and 
work harder with the first group, but to re- 
ject and avoid the second group. 

Studies of other child characteristics that 
may systematically affect teachers’ atti- 
tudes would also be useful. For example, 
how do teachers respond to the child who 
asks endless but relevant questions, or to 
the very dependent child, or to the class 
clown? Such children may provoke predict- 
able teacher attitudes and behavior. 

Studies of stability in teacher attitudes 
are also needed. This includes stability in 
the attitudes of a single teacher over the 
course of a school year, as well as agree- 
ment across teachers in attitudes toward 
particular students. Where the same child is 
viewed the same way by several successive 
teachers, it is likely that self-fulfilling 
prophecy effects and cumulative effects of 
systematic differential treatment would ap- 
pear. The authors are presently conducting 
a follow-up study of these same children, 
now in second grade, to provide some data
on stability across teachers in attitudes to- 
ward the same children. 


REFERENCES 


Brophy, J., & Good, T. Brophy-Good System
(Teacher-Child Dyadic Interaction). In A. Si-
mon & E. Boyer (Eds.), Mirrors for behavior:
An anthology of observation instruments con-
tinued (1970 supplement), Vol. A. Philadelphia:
Research for Better Schools, 1970.

Brophy, J., & Good, T. Individual differences: To-
ward an understanding of classroom life. New
York: Holt, Rinehart & Winston, 1973 (in press).

Feshbach, N. Student teacher preferences for
elementary school pupils varying in personality
characteristics. Journal of Educational Psychol-
ogy, 1969, 60, 126-132.


Good, T., & Brophy, J. Teacher-child dyadic inter-
actions: A new method of classroom observation.
Journal of School Psychology, 1970, 8, 131-138.

Good, T., & Brophy, J. Analyzing classroom inter-
action: A more powerful alternative. Educational
Technology, 1971, 11, 36-41.

Jackson, P., & Lahaderne, H. Inequalities of
teacher-pupil contacts. Psychology in the
Schools, 1967, 4, 204-208.

Jackson, P., Silberman, M., & Wolfson, B. Signs of
personal involvement in teachers' descriptions of
their students. Journal of Educational Psychol-
ogy, 1969, 60, 22-27.

Silberman, M. Behavioral expression of teachers’
attitudes toward elementary school students.
Journal of Educational Psychology, 1969, 60,
402-407.


(Received August 17, 1971) 

